Mammalian genes involved in viral infection and tumor suppression

ABSTRACT

The present invention provides methods of identifying cellular genes necessary for viral growth and cellular genes that function as tumor suppressors. Thus, the present invention provides nucleic acids related to and methods of reducing or preventing viral infection or cancer. The invention also provides methods of producing substantially virus-free cell cultures and methods for screening for additional such genes.

[0001] This application was made with partial government support under National Institutes of Health Grant No. CA68283 and a grant from the Department of Veterans Affairs. The United States Government has some rights in the invention.

BACKGROUND

[0002] 1. Field of the Invention

[0003] The present invention provides methods of identifying cellular genes used for viral growth or for tumor progression. Thus, the present invention relates to nucleic acids related to and methods of reducing or preventing viral infection and for suppressing tumor progression. The invention also relates to methods for screening for additional such genes.

[0004] 2. Background Art

[0005] Various projects have been directed toward isolating and sequencing the genome of various animals, notably the human. However, most methodologies provide nucleotide sequences for which no function is linked or even suggested, thus limiting the immediate usefulness of such data.

[0006] The present invention, in contrast, provides methods of screening only for nucleic acids that are involved in a specific process, i.e., viral infection or tumor progression, and further, for nucleic acids useful in treatments for these processes because by this method only nucleic acids which are also nonessential to the cell are isolated. Such methods are highly useful, since they ascribe a function to each isolated gene, and thus the isolated nucleic acids can immediately be utilized in various specific methods and procedures.

[0007] For example, the present invention provides methods of isolating nucleic acids encoding gene products used for viral infection, but nonessential to the cell. Viral infections of the intestine and liver are significant causes of human morbidity and mortality. Understanding the molecular mechanisms of such infections will lead to new approaches in their treatment and control.

[0008] Viruses can establish a variety of types of infection. These infections can be generally classified as lytic or persistent, though some lytic infections are considered persistent. Generally, persistent infections fall into two categories: (1) chronic (productive) infection, i.e., infection wherein infectious virus is present and can be recovered by traditional biological methods and (2) latent infection, i.e., infection wherein viral genome is present in the cell but infectious virus is generally not produced except during intermittent episodes of reactivation. Persistence generally involves stages of both productive and latent infection.

[0009] Lytic infections can also persist under conditions where only a small fraction of the total cells are infected (smoldering (cycling) infection). The few infected cells release virus and are killed, but the progeny virus again only infect a small number of the total cells. Examples of such smoldering infections include the persistence of lactic dehydrogenase virus in mice (Mahy, B. W. J., Br. Med. Bull. 41: 50-55 (1985)) and adenovirus infection in humans (Porter, D. D. pp. 784-790 in Baron, S., ed. Medical Microbiology 2d ed. (Addison-Wesley, Menlo Park, Calif. 1985)).

[0010] Furthermore, a virus may be lytic for some cell types but not for others. For example, evidence suggests that human immunodeficiency virus (HIV) is more lytic for T cells than for monocytes/macrophages, and therefore can result in a productive infection of T cells that can result in cell death, whereas HIV-infected mononuclear phagocytes may produce virus for considerable periods of time without cell lysis (Klatzmann, et al. Science 225:59-62 (1984); Koyanagi, et al. Science 241:1673-1675 (1988); Sattentau, et al. Cell 52:631-633 (1988)).

[0011] Traditional treatments for viral infection include pharmaceuticals aimed at specific virus derived proteins, such as HIV protease or reverse transcriptase, or recombinant (cloned) immune modulators (host derived), such as the interferons. However, the current methods have several limitations and drawbacks, including high rates of viral mutation that render anti-viral pharmaceuticals ineffective. For immune modulators, limited effectiveness, limiting side effects, and a lack of specificity all limit the general applicability of these agents. Also, the rate of success with current antivirals and immune modulators has been disappointing.

[0012] The current invention focuses on isolating genes that are not essential for cellular survival when disrupted in one or both alleles, but which are required for virus replication. This may occur with a dose effect, in which one allele knock-out may confer the phenotype of virus resistance for the cell. As targets for therapeutic intervention, inhibition of these cellular gene products, including: proteins, parts of proteins (modification enzymes that include, but are not restricted to, glycosylation, lipid modifiers [myristoylation, etc.]), lipids, transcription elements and RNA regulatory molecules, may be less likely to have profound toxic side effects, and virus mutation is less likely to overcome the “block” to replicate successfully.

[0013] The present invention provides a significant improvement over previous methods of attempted therapeutic intervention against viral infection by addressing the cellular genes required by the virus for growth. Therefore, the present invention also provides an innovative therapeutic approach to intervention in viral infection by providing methods to treat viruses by inhibiting the cellular genes necessary for viral infection. Because these genes, by virtue of the means by which they are originally detected, are nonessential to the cell's survival, these treatment methods can be used in a subject without serious detrimental effects to the subject, as has been found with previous methods. The present invention also provides the surprising discovery that virally infected cells are dependent upon a factor in serum to survive. Therefore, the present invention also provides a method for treating viral infection by inhibiting this serum survival factor. Finally, these discoveries also provide a novel method for removing virally infected cells from a cell culture by removing, inhibiting or disrupting this serum survival factor in the culture so that non-infected cells selectively survive.

[0014] The selection of tumor suppressor gene(s) has become an important area in the discovery of new targets for therapeutic intervention in cancer. Since the discovery that cells are restricted from promiscuous entry into the cell cycle by specific genes that are capable of suppressing a “transformed” phenotype, considerable time has been invested in the discovery of such genes. Some of these genes include the gene associated with retinoblastoma (Rb) and the p53 (apoptosis related) encoding gene. The present invention provides a method, using gene-trapping, to select cell lines that have a transformed phenotype from cells that are not transformed and to isolate from these cells a gene that can suppress a malignant phenotype. Thus, by the nature of the isolation process, a function is associated with the isolated genes. The capacity to select tumor suppressor genes quickly can provide unique targets in the process of treating or preventing, and even for diagnostic testing of, cancer.

DETAILED DESCRIPTION OF THE INVENTION

[0015] The present invention utilizes a “gene trap” method along with a selection process to identify and isolate nucleic acids from genes associated with a particular function. Specifically, it provides a means of isolating cellular genes necessary for viral infection but not essential for the cell's survival, and it provides a means of isolating cellular genes that suppress tumor progression.

[0016] The present invention also provides a core discovery that virally infected cells become dependent upon at least one factor present in serum for survival, whereas non-infected cells do not exhibit this dependence. This core discovery has been utilized in the present invention in several ways. First, inhibition of the “serum survival factor” can be utilized to eradicate persistently virally infected cells from populations of non-infected cells. Inhibition of this factor can also be used to treat virus infection in a subject, as further described herein. Additionally, inhibition of or withdrawal of the serum survival factor in tissue culture allows for the detection of cellular genes required for viral replication yet nonessential for an uninfected cell to survive. The present invention further provides several such cellular genes, as well as methods of treating viral infections by inhibiting the functioning of such genes.

[0017] Furthermore, the present invention provides a method for isolation of cellular genes utilized in tumor progression.

[0018] The present method provides several cellular genes that are necessary for viral growth in the cell but are not essential for the cell to survive. These genes are important for lytic and persistent infection by viruses. These genes were isolated by generating gene trap libraries by infecting cells with a retrovirus gene trap vector, selecting for cells in which a gene trap event occurred (i.e., in which the vector had inserted such that the promoterless marker gene was inserted such that a cellular promoter promotes transcription of the marker gene, i.e., inserted into a functioning gene), starving the cells of serum, infecting the selected cells with the virus of choice while continuing serum starvation, and adding back serum to allow visible colonies to develop, which colonies were cloned by limiting dilution. Genes into which the retrovirus gene trap vector inserted were then isolated from the colonies using probes specific for the retrovirus gene trap vector. Thus nucleic acids isolated by this method are isolated portions of genes.

[0019] Thus the present invention provides a method of identifying a cellular gene necessary for viral growth in a cell and nonessential for cellular survival, comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, thereby identifying a gene necessary for viral growth in a cell and nonessential for cellular survival. The present invention also provides a method of identifying a cellular gene used for viral growth in a cell and nonessential for cellular survival, comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, thereby identifying a gene necessary for viral growth in a cell and nonessential for cellular survival. In any selected cell type, such as Chinese hamster ovary cells, one can readily determine if serum starvation is required for selection. If it is not, serum starvation may be eliminated from the steps.
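The selection logic of steps (a) through (e) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the experimental protocol: the gene names, the sets `ESSENTIAL_GENES` and `VIRUS_REQUIRED_GENES`, and the all-or-nothing survival rules are hypothetical and were invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical annotations, invented purely for this illustration.
ESSENTIAL_GENES = {"geneE"}        # disruption is assumed to kill the cell
VIRUS_REQUIRED_GENES = {"geneV"}   # disruption is assumed to block viral growth


@dataclass
class Cell:
    # Gene disrupted by the promoterless marker, or None if the vector
    # did not land in an actively transcribed gene.
    trapped_gene: Optional[str]


def expresses_marker(cell: Cell) -> bool:
    # The marker has no promoter of its own, so it is expressed only
    # when inserted into a functioning gene.
    return cell.trapped_gene is not None


def is_viable(cell: Cell) -> bool:
    # Toy rule: a cell with an essential gene disrupted does not grow.
    return cell.trapped_gene not in ESSENTIAL_GENES


def survives_infection(cell: Cell) -> bool:
    # Toy rule: infected cells are lost unless the virus cannot grow in them.
    return cell.trapped_gene in VIRUS_REQUIRED_GENES


def screen(library: List[Cell]) -> List[str]:
    selected = [c for c in library if expresses_marker(c) and is_viable(c)]  # step (b)
    survivors = [c for c in selected if survives_infection(c)]               # steps (c)-(d)
    return sorted({c.trapped_gene for c in survivors})                       # step (e)


library = [Cell(None), Cell("geneE"), Cell("geneV"), Cell("geneN")]
candidates = screen(library)
print(candidates)
```

In this model only the cell carrying an insertion in the hypothetical `geneV` passes every filter, which mirrors the reasoning that survivors carry insertions in genes the virus needs but the cell does not.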

[0020] Alternatively, instead of removing serum from the culture medium, a serum factor required by the virus for growth can be inhibited, such as by the administration of an antibody that specifically binds that factor. Furthermore, if it is believed that there are no persistently infected cells in the culture, the serum starvation step can be eliminated and the cells grown in usual medium for the cell type. If serum starvation is used, it can be continued for a time after the culture is infected with the virus. Serum can then be added back to the culture. If some other method is used to inactivate the factor, it can be discontinued, inactivated or removed (such as removing the anti-factor antibody, e.g., with a bound antibody directed against that antibody) prior to adding fresh serum back to the culture. Cells that survive are mutants having an inactivating insertion in a gene necessary for growth of the virus. The genes having the insertions can then be isolated by isolating sequences having the marker gene sequences. This mutational process disturbs a wild type function. A mutant gene may produce at a lower level a normal product, it may produce a normal product not normally found in these cells, it may cause the overproduction of a normal product, it may produce an altered product that has some functions but not others, or it may completely disrupt a gene function. Additionally, the mutation may disrupt an RNA that has a function but is never translated into a protein. For example, the alpha-tropomyosin gene has a 3′ RNA that is very important in cell regulation but never is translated into protein (Cell 75:1107-1117, 12/17/93).

[0021] As used herein, a cellular gene “nonessential for cellular survival” means a gene for which disruption of one or both alleles results in a cell viable for at least a period of time which allows viral replication to be inhibited for preventative or therapeutic uses or use in research. A gene “necessary for viral growth” means the gene product, either protein or RNA, secreted or not, is necessary, either directly or indirectly in some way for the virus to grow, and therefore, in the absence of that gene product (i.e., a functionally available gene product), at least some of the cells containing the virus die. For example, such genes can encode cell cycle regulatory proteins, proteins affecting the vacuolar hydrogen pump, or proteins involved in protein folding and protein modification, including but not limited to: phosphorylation, methylation, glycosylation, myristoylation or other lipid moiety, or protein processing via enzymatic processing. Some examples of such genes are exemplified herein, wherein some of the isolated nucleic acids correspond to genes such as vacuolar H+ATPase, alpha tropomyosin, gas5 gene, ras complex, N-acetyl-glucosaminyltransferase I mRNA, and calcyclin.

[0022] Any virus capable of infecting the cell can be used for this method. Virus can be selected based upon the particular infection desired to study. However, it is contemplated by the present invention that many viruses will be dependent upon the same cellular genes for survival; thus a cellular gene isolated using one virus can be used as a target for therapy for other viruses as well. Any cellular gene can be tested for relevancy to any desired virus using the methods set forth herein, i.e., in general, by inhibiting the gene or its gene product in a cell and determining if the desired virus can grow in that cell. Some examples of viruses include HIV (including HIV-1 and HIV-2); parvovirus; papillomaviruses; hantaviruses; influenza viruses (e.g., influenza A, B and C viruses); hepatitis viruses A to G; caliciviruses; astroviruses; rotaviruses; coronaviruses, such as human respiratory coronavirus; picornaviruses, such as human rhinovirus and enterovirus; ebola virus; human herpesvirus (e.g., HSV-1-9); human cytomegalovirus; human adenovirus; Epstein-Barr virus; and, for animals, the animal counterpart to any above listed human virus and animal retroviruses, such as simian immunodeficiency virus, avian immunodeficiency virus, bovine immunodeficiency virus, feline immunodeficiency virus, equine infectious anemia virus, caprine arthritis encephalitis virus or visna virus.

[0023] The nucleic acids comprising cellular genes of this invention were isolated by the above method and as set forth in the examples. The invention includes a nucleic acid comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75 (this list is sometimes referred to herein as “SEQ ID NO:5 through SEQ ID NO:75” for brevity). Thus these nucleic acids can contain, in addition to the nucleotides set forth in each SEQ ID NO in the sequence listing, additional nucleotides at either end of the molecule. Such additional nucleotides can be added by any standard method, as known in the art, such as recombinant methods and synthesis methods. Examples of such nucleic acids comprising the nucleotide sequence set forth in any entry of the sequence listing contemplated by this invention include, but are not limited to, for example, the nucleic acid placed into a vector; a nucleic acid having one or more regulatory regions (e.g., promoter, enhancer, polyadenylation site) linked to it, particularly in functional manner, i.e., such that an mRNA or a protein can be produced; a nucleic acid including additional nucleic acids of the gene, such as a larger or even full length genomic fragment of the gene, a partial or full length cDNA, a partial or full length RNA. Making and/or isolating such larger nucleic acids is further described below and is well known and standard in the art.

[0024] The invention also provides a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, as well as allelic variants and homologs of each such gene. The gene is readily obtained using standard methods, as described below and as is known and standard in the art. The present invention also contemplates any unique fragment of these genes or of the nucleic acids set forth in any of SEQ ID NO:5 through SEQ ID NO:75. Examples of inventive fragments of the inventive genes are the nucleic acids whose sequence is set forth in any of SEQ ID NO:5 through SEQ ID NO:75. To be unique, the fragment must be of sufficient size to distinguish it from other known sequences, most readily determined by comparing any nucleic acid fragment to the nucleotide sequences of nucleic acids in computer databases, such as GenBank. Such comparative searches are standard in the art. Typically, a unique fragment useful as a primer or probe will be at least about 20 to about 25 nucleotides in length, depending upon the specific nucleotide content of the sequence. Additionally, fragments can be, for example, at least about 30, 40, 50, 75, 100, 200 or 500 nucleotides in length. The nucleic acids can be single or double stranded, depending upon the purpose for which they are intended.
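The uniqueness test described above can be sketched in a few lines. This is a simplified illustration only: it uses exact substring matching on both strands against a small in-memory list, whereas a real comparison against a database such as GenBank would use a similarity search tool. The function names and the example sequences are hypothetical.

```python
from typing import Iterable


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def is_unique_fragment(fragment: str,
                       known_sequences: Iterable[str],
                       min_length: int = 20) -> bool:
    """True if the fragment is long enough and absent from every known sequence.

    Exact matching on either strand stands in for a database search here.
    """
    frag = fragment.upper()
    if len(frag) < min_length:
        return False
    rc = reverse_complement(frag)
    return not any(frag in s.upper() or rc in s.upper() for s in known_sequences)


# Hypothetical reference collection and candidate fragments.
known = ["TTTTACGTACGTACGTACGTACGTTTTT"]
print(is_unique_fragment("ACGTACGTACGTACGTACGT", known))   # present in the reference
print(is_unique_fragment("GATTACAGATTACAGATTACA", known))  # absent from the reference
```

The default `min_length` of 20 follows the lower bound given in the text for primers and probes; it is a parameter so that longer minimum sizes can be tried.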

[0025] The present invention further provides a nucleic acid comprising the regulatory region of a gene comprising the nucleotide sequences set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75. Additionally provided is a construct comprising such a regulatory region functionally linked to a reporter gene. Such reporter gene constructs can be used to screen for compounds and compositions that affect expression of the gene comprising the nucleic acids whose sequence is set forth in any of SEQ ID NO:5 through SEQ ID NO:75. The nucleic acids set forth in the sequence listing are gene fragments; the entire coding sequence and the entire gene that comprises each fragment are both contemplated herein and are readily obtained by standard methods, given the nucleotide sequences presented in the sequence listing (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989; DNA cloning: A Practical Approach, Volumes I and II, Glover, D. M. ed., IRL Press Limited, Oxford, 1985). To obtain the entire genomic gene, briefly, a nucleic acid whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83, or preferably in any of SEQ ID NO:5 through SEQ ID NO:83, or a smaller fragment thereof, is utilized as a probe to screen a genomic library under high stringency conditions, and isolated clones are sequenced. Once the sequence of the new clone is determined, a probe can be devised from a portion of the new clone not present in the previous fragment and hybridized to the library to isolate more clones containing fragments of the gene. In this manner, by repeating this process in organized fashion, one can “walk” along the chromosome and eventually obtain nucleotide sequence for the entire gene. Similarly, one can use portions of the present fragments, or additional fragments obtained from the genomic library, that contain open reading frames to screen a cDNA library to obtain a cDNA having the entire coding sequence of the gene. Repeated screens can be utilized as described above to obtain the complete sequence from several clones if necessary. The isolates can then be sequenced to determine the nucleotide sequence by standard means such as dideoxynucleotide sequencing methods (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989).
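The iterative “walk” described above can be pictured with a toy in-silico version. This is a sketch under strong assumptions: clones are plain strings, hybridization is modeled as exact substring matching, the walk proceeds in one direction only, and the sequences, the function name `walk` and its parameters are hypothetical.

```python
from typing import List


def walk(seed: str, clones: List[str], probe_len: int = 6, max_steps: int = 100) -> str:
    """Extend a known fragment rightward using overlapping library clones.

    At each step the last probe_len bases of the current contig serve as the
    probe; a clone that contains the probe and continues past it contributes
    new sequence. max_steps guards against endless loops on repeats.
    """
    contig = seed
    for _ in range(max_steps):
        probe = contig[-probe_len:]
        for clone in clones:
            pos = clone.find(probe)
            if pos == -1:
                continue  # this clone does not "hybridize" to the probe
            tail = clone[pos + probe_len:]
            if tail:
                contig += tail  # sequence not present in the previous fragment
                break
        else:
            return contig  # no clone extends the contig any further
    return contig


# Hypothetical locus and a library of three overlapping clones.
genome = "ATGGCGTACCTTGAGCATCGGATTACAGGCTTAA"
clones = [genome[0:14], genome[8:24], genome[18:34]]
seed = genome[0:10]
assembled = walk(seed, clones)
print(assembled == genome)
```

Each pass corresponds to one round of probing the library with sequence from the newest clone, which is the organized repetition the paragraph describes.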

[0026] The present genes were isolated from rat; however, homologs in any desired species, preferably mammalian, such as human, can readily be obtained by screening a human library, genomic or cDNA, with a probe comprising sequences of the nucleic acids set forth in the sequence listing herein, or fragments thereof, and isolating genes specifically hybridizing with the probe under preferably relatively high stringency hybridization conditions. For example, high salt conditions (e.g., in 6× SSC or 6× SSPE) and/or high temperatures of hybridization can be used. For example, the stringency of hybridization is typically about 5° C. to 20° C. below the Tm (the melting temperature at which half of the molecules dissociate from their partners) for the given chain length. As is known in the art, the nucleotide composition of the hybridizing region factors in determining the melting temperature of the hybrid. For 20mer probes, for example, the recommended hybridization temperature is typically about 55-58° C. Additionally, the rat sequence can be utilized to devise a probe for a homolog in any specific animal by determining the amino acid sequence for a portion of the rat protein, and selecting a probe with optimized codon usage to encode the amino acid sequence of the homolog in that particular animal. Any isolated gene can be confirmed as the targeted gene by sequencing the gene to determine that it contains the nucleotide sequence listed herein as comprising the gene. Any homolog can be confirmed as a homolog by its functionality.
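As a rough illustration of how nucleotide composition feeds into the melting temperature, the following sketch applies the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), a common rule of thumb for short oligonucleotide probes that is not taken from this specification. The function names and the example probe are hypothetical; the 5 °C and 20 °C offsets are the ones quoted in the text.

```python
from typing import Tuple


def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C for a short oligo using the Wallace rule."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def hybridization_window(oligo: str,
                         low_offset: int = 5,
                         high_offset: int = 20) -> Tuple[int, int]:
    """Return (lowest, highest) hybridization temperature, 5-20 C below Tm."""
    tm = wallace_tm(oligo)
    return (tm - high_offset, tm - low_offset)


# Hypothetical 20mer with 50% G+C content.
probe = "ATGCATGCATGCATGCATGC"
print(wallace_tm(probe))            # 2*10 + 4*10 = 60
print(hybridization_window(probe))  # (40, 55)
```

For this 50% G+C 20mer the estimate of 60 °C places the upper end of the window at 55 °C, in line with the 55-58 °C figure given for 20mer probes; longer probes call for other Tm formulas and empirical adjustment.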

[0027] Additionally contemplated by the present invention are nucleic acids, from any desired species, preferably mammalian and more preferably human, having 98%, 95%, 90%, 85%, 80%, 70%, 60%, or 50% homology, or greater, in the region of homology, to a region in an exon of a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID NO:75 of the sequence listing or to homologs thereof. Also contemplated by the present invention are nucleic acids, from any desired species, preferably mammalian and more preferably human, having 98%, 95%, 90%, 85%, 80%, 70%, 60%, or 50% homology, or greater, in the region of homology, to a region in an exon of a nucleic acid comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID NO:75 of the sequence listing or to homologs thereof. These genes can be synthesized or obtained by the same methods used to isolate homologs, with stringency of hybridization and washing, if desired, reduced accordingly as homology desired is decreased, and further, depending upon the G-C or A-T richness of any area wherein variability is searched for. Allelic variants of any of the present genes or of their homologs can readily be isolated and sequenced by screening additional libraries following the protocol above. Methods of making synthetic genes are described in U.S. Pat. No. 5,503,995 and the references cited therein.
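The percentage thresholds above can be made concrete with a simple percent identity calculation over two sequences that have already been aligned. This is a sketch, not the measure defined by the specification: it assumes that “homology in the region of homology” can be approximated by identity over aligned, non-gap columns, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two pre-aligned sequences.

    Columns containing a gap character ('-') in either sequence are skipped,
    which is one of several possible conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned region reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned region with one mismatch in ten positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))       # 90.0
print(meets_threshold("ACGTACGTAC", "ACGTACGTTC", 85.0))  # True
```

Producing the alignment itself is a separate step that would normally be done with an established alignment tool.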

[0028] The nucleic acid encoding any selected protein of the present invention can be any nucleic acid that functionally encodes that protein. For example, to functionally encode, i.e., allow the nucleic acid to be expressed, the nucleic acid can include, for example, exogenous or endogenous expression control sequences, such as an origin of replication, a promoter, an enhancer, and necessary information processing sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Preferred expression control sequences can be promoters derived from metallothionein genes, actin genes, immunoglobulin genes, CMV, SV40, adenovirus, bovine papilloma virus, etc. Expression control sequences can be selected for functionality in the cells in which the nucleic acid will be placed. A nucleic acid encoding a selected protein can readily be determined based upon the amino acid sequence of the selected protein, and, clearly, many nucleic acids will encode any selected protein.

[0029] The present invention additionally provides a nucleic acid that selectively hybridizes under stringent conditions with a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in any sequence listed herein (i.e., any of SEQ ID NO:5 through SEQ ID NO:75). This hybridization can be specific. The degree of complementarity between the hybridizing nucleic acid and the sequence to which it hybridizes should be at least enough to exclude hybridization with a nucleic acid encoding an unrelated protein. Thus, a nucleic acid that selectively hybridizes with a nucleic acid of the present protein coding sequence will not selectively hybridize under stringent conditions with a nucleic acid for a different, unrelated protein, and vice versa. Typically, the stringency of hybridization to achieve selective hybridization involves hybridization in high ionic strength solution (6× SSC or 6× SSPE) at a temperature that is about 12-25° C. below the Tm (the melting temperature at which half of the molecules dissociate from their partners) followed by washing at a combination of temperature and salt concentration chosen so that the washing temperature is about 5° C. to 20° C. below the Tm of the hybrid molecule. The temperature and salt conditions are readily determined empirically in preliminary experiments in which samples of reference DNA immobilized on filters are hybridized to a labeled nucleic acid of interest and then washed under conditions of different stringencies. Hybridization temperatures are typically higher for DNA-RNA and RNA-RNA hybridizations. The washing temperatures can be used as described above to achieve selective stringency, as is known in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989; Kunkel et al., Methods Enzymol. 154:367, 1987). Nucleic acid fragments that selectively hybridize to any given nucleic acid can be used, e.g., as primers and/or probes for further hybridization or for amplification methods (e.g., polymerase chain reaction (PCR), ligase chain reaction (LCR)). A preferable stringent hybridization condition for a DNA:DNA hybridization can be at about 68° C. (in aqueous solution) in 6× SSC or 6× SSPE followed by washing at 68° C.

[0030] The present invention additionally provides a protein encoded by a nucleic acid encoding the protein encoded by the gene comprising any of the nucleotide sequences set forth herein (i.e., any of SEQ ID NO:5 through SEQ ID NO:75). The protein can be readily obtained by any of several means. For example, the nucleotide sequence of coding regions of the gene can be translated and then the corresponding polypeptide can be synthesized mechanically by standard methods. Additionally, the coding regions of the genes can be expressed or synthesized, an antibody specific for the resulting polypeptide can be raised by standard methods (see, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1988), and the protein can be isolated from other cellular proteins by selective binding to the antibody. This protein can be purified to the extent desired by standard methods of protein purification (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989). The amino acid sequence of any protein, polypeptide or peptide of this invention can be deduced from the nucleic acid sequence, or it can be determined by sequencing an isolated or recombinantly produced protein.
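Deducing an amino acid sequence from a coding region, as mentioned above, amounts to reading the sequence three bases at a time against the standard genetic code. The following is a minimal sketch assuming an in-frame coding sequence that begins at the first base; the example sequence and the function name `translate` are hypothetical, and alternative genetic codes are not handled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence up to the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence: Met-Ala-Lys followed by a stop codon.
print(translate("ATGGCCAAATGA"))  # MAK
```

Trailing bases that do not form a complete codon are ignored, and a codon containing a character other than A, C, G, T or U raises a `KeyError`.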

[0031] The terms “peptide,” “polypeptide” and “protein” are used interchangeably herein and refer to a polymer of amino acids, including full-length proteins and fragments thereof. As used in the specification and in the claims, “a” can mean one or more, depending upon the context in which it is used. An amino acid residue is an amino acid formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages. The amino acid residues described herein are preferably in the “L” isomeric form. However, residues in the “D” isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide. Standard polypeptide nomenclature (described in J. Biol. Chem., 243:3552-59 (1969) and adopted at 37 CFR §1.822(b)) is used herein.

[0032] As will be appreciated by those skilled in the art, the invention also includes those polypeptides having slight variations in amino acid sequences or other properties. Amino acid substitutions can be selected by known parameters to be neutral (see, e.g., Robinson W E Jr, and Mitchell W M., AIDS 4:S151-S162 (1990)). Such variations may arise naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced point, deletion, insertion and substitution mutants. Minor changes in amino acid sequence are generally preferred, such as conservative amino acid replacements, small internal deletions or insertions, and additions or deletions at the ends of the molecules. Substitutions may be designed based on, for example, the model of Dayhoff, et al. (in Atlas of Protein Sequence and Structure 1978, Nat'l Biomed. Res. Found, Washington, D.C.). These modifications can result in changes in the amino acid sequence, provide silent mutations, modify a restriction site, or provide other specific mutations. Likewise, such amino acid changes result in a different nucleic acid encoding the polypeptides and proteins. Thus, alternative nucleic acids are also contemplated by such modifications.

[0033] The present invention also provides cells containing a nucleic acid of the invention. A cell containing a nucleic acid encoding a protein typically can replicate the DNA and, further, typically can express the encoded protein. The cell can be a prokaryotic cell, particularly for the purpose of producing quantities of the nucleic acid, or a eukaryotic cell, particularly a mammalian cell. The cell is preferably a mammalian cell for the purpose of expressing the encoded protein so that the resultant produced protein has mammalian protein processing modifications.

[0034] Nucleic acids of the present invention can be delivered into cells by any selected means, in particular depending upon the purpose of the delivery of the compound and the target cells. Many delivery means are well-known in the art. For example, electroporation, calcium phosphate precipitation, microinjection, cationic or anionic liposomes, and liposomes in combination with a nuclear localization signal peptide for delivery to the nucleus can be utilized, as is known in the art.

[0035] The present invention also contemplates that the mutated cellular genes necessary for viral growth, produced by the present method, as well as cells containing these mutants, can be useful. These mutated genes and cells containing them can be isolated and/or produced according to the methods herein described and using standard methods.

[0036] It should be recognized that the sequences set forth herein may contain minor sequencing errors. Such errors can be corrected, for example, by using the hybridization procedure described above with various probes derived from the described sequences such that the coding sequence can be reisolated and resequenced.

[0037] As described in the examples, the present invention provides the discovery of a “serum survival factor” present in serum that is necessary for the survival of persistently virally infected cells. Isolation and characterization of this factor have shown it to be a protein, to have a molecular weight of between about 50 kD and 100 kD, to resist inactivation in low pH (e.g., pH 2) and chloroform extraction, to be inactivated by boiling for about 5 minutes and in low ionic strength solution (e.g., about 10 mM to about 50 mM). The present invention thus provides a purified mammalian serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus selectively substantially prevents survival of cells persistently infected with reovirus. The factor, fitting the physical characteristics described above, can readily be verified by adding it to non-serum-containing medium (which previously could not support survival of persistently virally infected cells) and determining whether this medium with the added putative factor can now support persistently virally infected cells, particularly cells persistently infected with reovirus. As used herein, a “purified” protein means the protein is at least of sufficient purity such that an approximate molecular weight can be determined.

[0038] The amino acid sequence of the protein can be elucidated by standard methods. For example, an antibody to the protein can be raised and used to screen an expression library to obtain nucleic acid sequence coding the protein. This nucleic acid sequence is then simply translated into the corresponding amino acid sequence. Alternatively, a portion of the protein can be directly sequenced by standard amino acid sequencing methods (amino-terminus sequencing). This amino acid sequence can then be used to generate an array of nucleic acid probes that encompasses all possible coding sequences for a portion of the amino acid sequence. The array of probes is used to screen a cDNA library to obtain the remainder of the coding sequence and thus ultimately the corresponding amino acid sequence.
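The generation of a probe array covering all possible coding sequences for a short stretch of amino acid sequence can be sketched as follows. This is a minimal illustration in Python, assuming the standard genetic code; the function name `degenerate_probes` and the example peptides are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to every codon that encodes it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def degenerate_probes(peptide: str) -> list[str]:
    """Return every DNA sequence that could encode the given peptide."""
    choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical two-residue example: Met has one codon and Lys has two.
print(degenerate_probes("MK"))
```

Because the number of probes is the product of the codon counts at each position, stretches rich in residues with few codons (such as Met and Trp) keep the array small.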

[0039] The present invention also provides methods of detecting and isolating additional serum survival factors. For example, to determine if any known serum components are necessary for viral growth, the known components can be inhibited in, or eliminated from, the culture medium, and it can be observed whether viral growth is inhibited by determining if persistently infected cells do not survive. One can add the factor back (or remove the inhibition) and determine whether the factor allows for viral growth.

[0040] Additionally, other, unknown serum components can also be found to be essential for viral growth. Serum can be fractionated by various standard means, and fractions added to serum-free medium to determine if a factor is present in a fraction that allows viral growth previously inhibited by the lack of serum. Fractions having this activity can then be further fractionated until the factor is relatively free of other components. The factor can then be characterized by standard methods, such as size fractionation, denaturation and/or inactivation by various means, etc. Preferably, once the factor has been purified to a desired level of purity, it is added to cells in serum-free medium to confirm that it bestows the function of allowing virus to grow when serum-free medium alone did not. This method can be repeated to confirm the requirement for the specific factor for any desired virus, since each serum factor found to be required by any one virus can also be required by many other viruses. In general, the closer the viruses are related and the more similar the infection modes of the viruses, the more likely that a factor required by one virus will be required by the other.

[0041] The present invention also provides methods of treating virus infections utilizing applicants' discoveries. The subject of any of the herein described methods can be any animal, preferably a mammal, such as a human, a veterinary animal, such as a cat, dog, horse, pig, goat, sheep, or cow, or a laboratory animal, such as a mouse, rat, rabbit, or guinea pig, depending upon the virus.

[0042] The present invention provides a method of reducing or inhibiting, and thereby treating, a viral infection in a subject, comprising administering to the subject an inhibiting amount of a composition that inhibits functioning of the serum protein described herein, i.e., the serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with the virus prevents survival of at least some cells persistently infected with the virus, thereby treating the viral infection. The composition can comprise, for example, an antibody that specifically binds the serum protein, or an antisense RNA that binds an RNA encoded by a gene functionally encoding the serum protein.

[0043] Any virus capable of infecting the selected subject to be treated can be treated by the present method. As described above, any serum protein or survival factor found by the present methods to be necessary for growth of any one virus can be found to be necessary for growth of many other viruses. For any given virus, the serum protein or factor can be confirmed to be required for growth by the methods described herein. The cellular genes identified by the examples using reovirus, a mammalian pathogen, and a rat cell system have general applicability to other virus infections that include all of the known as well as yet to be discovered human pathogens, including, but not limited to: human immunodeficiency viruses (e.g., HIV-1, HIV-2); parvovirus; papillomaviruses; hantaviruses; influenza viruses (e.g., influenza A, B and C viruses); hepatitis viruses A to G; caliciviruses; astroviruses; rotaviruses; coronaviruses, such as human respiratory coronavirus; picornaviruses, such as human rhinovirus and enterovirus; ebola virus; human herpesvirus (e.g., HSV-1-9); human cytomegalovirus; human adenovirus; Epstein-Barr virus; and, for animals, the animal counterpart to any above-listed human virus, as well as animal retroviruses, such as simian immunodeficiency virus, avian immunodeficiency virus, bovine immunodeficiency virus, feline immunodeficiency virus, equine infectious anemia virus, caprine arthritis encephalitis virus or visna virus.

[0044] A protein inhibiting amount of the composition can be readily determined, such as by administering varying amounts to cells or to a subject and then adjusting the effective amount for inhibiting the protein according to the volume of blood or weight of the subject. Compositions that bind to the protein can be readily determined by running the putatively bound protein on a protein gel and observing an alteration in the protein's migration through the gel. Inhibition of the protein can be determined by any desired means such as adding the inhibitor to complete media used to maintain persistently infected cells and observing the cells' viability. The composition can comprise, for example, an antibody that specifically binds the serum protein. Specific binding by an antibody means that the antibody can be used to selectively remove the factor from serum or inhibit the factor's biological activity and can readily be determined by radioimmunoassay (RIA), bioassay, or enzyme-linked immunosorbent assay (ELISA) technology. The composition can comprise, for example, an antisense RNA that specifically binds an RNA encoded by the gene encoding the serum protein. Antisense RNAs can be synthesized and used by standard methods (e.g., Antisense RNA and DNA, D. A. Melton, Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1988)).
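An antisense RNA binds its target by base pairing, so its sequence is the reverse complement of the target RNA. The following is a minimal sketch in Python of that relationship only; the function name `antisense_rna` and the example sequence are hypothetical, and the sketch does not address probe length, target-site selection or stability.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_rna: str) -> str:
    """Return the reverse complement of a target RNA, written 5' to 3'."""
    seq = target_rna.upper().replace("T", "U")  # accept DNA-style input as well
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical six-base target, for illustration only.
print(antisense_rna("AUGGCC"))
```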

[0045] The present methods provide a method of screening a compound for treating a viral infection, comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product necessary for reproduction of the virus in the cell but not necessary for survival of the cell and detecting level of the gene product produced, a decrease or elimination of the gene product indicating a compound for treating the viral infection. The present methods also provide a method of screening a compound for effectiveness in treating a viral infection, comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product necessary for reproduction of the virus in the cell but not necessary for survival of the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for treating the viral infection. The cellular gene can be, for example, any gene provided herein, i.e., any of the genes comprising the nucleotide sequences set forth in any of SEQ ID NO:1 through SEQ ID NO:75, or any other gene obtained using the methods provided herein for obtaining such genes. Level of the gene product can be measured by any standard means, such as by detection with an antibody specific for the protein. The level of gene product can be compared to the level of the gene product in a control cell not contacted with the compound. The level of gene product can be compared to the level of the gene product in the same cell prior to addition of the compound. Relatedly, the regulatory region of the gene can be functionally linked to a reporter gene and compounds can be screened for inhibition of the reporter gene. Such reporter constructs are described herein.
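The comparison of gene product level in a treated cell against a control can be sketched as a simple calculation. The following Python sketch is illustrative only: the function name `indicates_candidate` and the `min_decrease` cutoff of 0.5 are assumptions introduced here, since the specification states only that a decrease or elimination of the gene product indicates a candidate compound and does not specify a numerical threshold.

```python
def indicates_candidate(treated_level: float,
                        control_level: float,
                        min_decrease: float = 0.5) -> bool:
    """Report whether a compound lowered the gene product level enough to flag it.

    treated_level: gene product level in the cell contacted with the compound.
    control_level: level in a control cell, or in the same cell before treatment.
    min_decrease:  assumed fractional decrease required (hypothetical cutoff).
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    fractional_decrease = (control_level - treated_level) / control_level
    return fractional_decrease >= min_decrease


# Hypothetical measurements in arbitrary units.
print(indicates_candidate(treated_level=20.0, control_level=100.0))
print(indicates_candidate(treated_level=90.0, control_level=100.0))
```

The same comparison applies when a reporter gene product is measured in place of the native gene product.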

[0046] The present invention provides a method of selectively eliminating cells persistently infected with a virus from an animal cell culture capable of surviving for a first period of time in the absence of serum, comprising propagating the cell culture in the absence of serum for a second time period which a persistently infected cell cannot survive without serum, thereby selectively eliminating from the cell culture cells persistently infected with the virus. The second time period should be shorter than the first time period. Thus one can simply eliminate serum from a standard culture medium composition for a period of time (e.g., by removing serum-containing medium from the culture container, rinsing the cells, and adding serum-free medium back to the container), then, after a time of serum starvation, return serum to the culture medium. Alternatively, one can inhibit a serum survival factor from the culture in place of the step of serum starvation. Furthermore, one can instead interfere with the virus-factor interaction. Such a viral elimination method can periodically be performed for cultured cells to ensure that they remain virus-free. The time period of serum removal can greatly vary, with a typical range being about 1 to about 30 days; a preferable period can be about 3 to about 10 days, and a more preferable period can be about 5 days to about 7 days. This time period can be selected based upon ability of the specific cell to survive without serum as well as the life cycle of the virus, e.g., for reovirus, which has a life cycle of about 24 hours, 3 days' starvation of cells provides dramatic results.

[0047] Furthermore, the time period can be shortened by also passaging the cells during the starvation; in general, increasing the number of passages can decrease the time of serum starvation (or serum factor inhibition) needed to get full clearance of the virus from the culture. While passaging, the cells typically are exposed briefly to serum (typically for about 3 to about 24 hours). This exposure both stops the action of the trypsin used to dislodge the cells and stimulates the cells into another cycle of growth, thus aiding in this selection process. Thus a starvation/serum cycle can be repeated to optimize the selective effect. Other standard culture parameters, such as confluency of the cultures, pH, temperature, etc. can be varied to alter the needed time period of serum starvation (or serum survival factor inhibition). This time period can readily be determined for any given viral infection by simply removing the serum for various periods of time, then testing the cultures for the presence of the infected cells (e.g., by ability to survive in the absence of serum and confirmed by quantitating virus in cells by standard virus titration and immunohistochemical techniques) at each tested time period, and then detecting at which time periods of serum deprivation the virally infected cells were eliminated. It is preferable that shorter time periods of serum deprivation that still provide elimination of the persistently infected cells be used. Furthermore, the cycle of starvation, then adding back serum and determining amount of virus remaining in the culture can be repeated until no virally infected cells remain in the culture.
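The determination of the starvation period described above amounts to testing several periods and choosing the shortest one at which infected cells are no longer detected. A minimal sketch in Python follows; the function name `shortest_clearing_period` and the titers in the example are hypothetical and do not represent measured data.

```python
from typing import Optional


def shortest_clearing_period(titer_by_days: dict[int, float]) -> Optional[int]:
    """Return the shortest tested serum-deprivation period with no detectable virus.

    titer_by_days maps each tested period (in days) to the residual viral titer
    measured afterwards; a titer of 0 means no infected cells were detected.
    Returns None if no tested period cleared the culture.
    """
    clearing = [days for days, titer in titer_by_days.items() if titer == 0]
    return min(clearing) if clearing else None


# Hypothetical titration results (arbitrary units), for illustration only.
results = {1: 1.0e4, 3: 50.0, 5: 0.0, 7: 0.0}
print(shortest_clearing_period(results))
```

If the function returns None, the text above suggests repeating the starvation/serum cycle or adding passages before retesting.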

[0048] Thus, the present method can further comprise passaging the cells, i.e., transferring the cell culture from a first container to a second container. Such transfer can facilitate the selective lack of survival of virally infected cells. Transfer can be repeated several times. Transfer is achieved by standard methods of tissue culture (see, e.g., Freshney, Culture of Animal Cells, A Manual of Basic Technique, 2nd Ed. Alan R. Liss, Inc., New York, 1987).

[0049] The present method further provides a method of selectively eliminating from a cell culture cells persistently infected with a virus, comprising propagating the cell culture in the absence of a functional form of the serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus substantially prevents survival of cells persistently infected with reovirus. The absence of the functional form can be achieved by any of several standard means, such as by binding the protein to an antibody selective for it (binding the antibody in serum either before or after the serum is added to the cells; if before, the serum protein can be removed from the serum by, e.g., binding the antibody to a column and passing the serum over the column and then administering the survival protein-free serum to the cells), by administering a compound that inactivates the protein, or by administering a compound that interferes with the interaction between the virus and the protein.

[0050] Thus, the present invention provides a method of selectively eliminating from a cell culture propagated in serum-containing medium cells persistently infected with a virus, comprising inhibiting in the serum the protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus substantially prevents survival of cells persistently infected with reovirus. Alternatively, the interaction between the virus and the serum protein can be disrupted to selectively eliminate cells persistently infected with the virus.

[0051] Any virus capable of some form of persistent infection may be eliminated from a cell culture utilizing the present elimination methods, including removing, inhibiting or otherwise interfering with a serum protein, such as the one exemplified herein, and also including removing, inhibiting or otherwise interfering with a gene product from any cellular gene found by the present method to be necessary for viral growth yet nonessential to the cell. For example, DNA viruses or RNA viruses can be targeted. One can readily determine whether cells infected with a selected virus can be selectively removed from a culture through removal of serum by starving cells permissive to the virus of serum (or inhibiting the serum survival factor), adding the selected virus to the cells, adding serum to the culture, and observing whether infected cells die (e.g., by titering levels of virus in the surviving cells with an antibody specific for the virus).

[0052] A culture of any animal cell (i.e., any cell that is typically grown and maintained in culture in serum) that can be maintained for a period of time in the absence of serum, can be purified from viral infection utilizing the present method. For example, primary cultures as well as established cultures and cell lines can be used. Furthermore, cultures of cells from any animal and any tissue or cell type within that animal that can be cultured and that can be maintained for a period of time in the absence of serum can be used. For example, cultures of cells from tissues typically infected, and particularly persistently infected, by an infectious virus could be used.

[0053] As used in the claims, “in the absence of serum” means at a level at which persistently virally infected cells do not survive. Typically, the threshold level is about 1% serum in the media. Therefore, about 1% serum or less can be used, such as about 1%, 0.75%, 0.50%, 0.25%, 0.1% or no serum.

[0054] As used herein, “selectively eliminating” cells persistently infected with a virus means that substantially all of the cells persistently infected with the virus are killed such that the presence of virally infected cells cannot be detected in the culture immediately after the elimination procedure has been performed. Furthermore, “selectively eliminating” includes that cells not infected with the virus are generally not killed by the method. Some surviving cells may still produce virus but at a lower level, and some may be defective in pathways that lead to death by the virus. Typically, for cells persistently infected with virus to be substantially all killed, more than about 90% of the cells, and more preferably more than about 95%, 98%, 99%, or 99.99% of virus-containing cells in the culture are killed.
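The percentage criterion above can be expressed as a short calculation. The following Python sketch is illustrative only; the function names and the cell counts in the example are hypothetical, and the default threshold of 0.90 reflects the "more than about 90%" figure stated above.

```python
def fraction_killed(infected_before: int, infected_after: int) -> float:
    """Fraction of virus-containing cells killed by the elimination procedure."""
    if infected_before <= 0:
        raise ValueError("infected_before must be positive")
    return (infected_before - infected_after) / infected_before


def substantially_all_killed(infected_before: int,
                             infected_after: int,
                             threshold: float = 0.90) -> bool:
    """True if more than `threshold` of the virus-containing cells were killed."""
    return fraction_killed(infected_before, infected_after) > threshold


# Hypothetical counts of virus-containing cells before and after the procedure.
print(substantially_all_killed(1000, 50))
print(substantially_all_killed(1000, 200))
```

Passing a threshold of 0.95, 0.98, 0.99 or 0.9999 applies the more preferred levels.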

[0055] The present method also provides a nucleic acid comprising the regulatory region of any of the genes. Such regulatory regions can be isolated from the genomic sequences isolated and sequenced as described above and identified by any characteristics observed that are characteristic for regulatory regions of the species and by their relation to the start codon for the coding region of the gene. The present invention also provides a construct comprising the regulatory region functionally linked to a reporter gene. Such constructs are made by routine subcloning methods, and many vectors are available into which regulatory regions can be subcloned upstream of a marker gene. Marker genes can be chosen for ease of detection of marker gene product.

[0056] The present method therefore also provides a method of screening a compound for treating a viral infection, comprising administering the compound to a cell containing any of the above-described constructs, comprising a regulatory region of one of the genes comprising the nucleotide sequence set forth in any of SEQ ID NO:1 through SEQ ID NO:75 functionally linked to a reporter gene, and detecting the level of the reporter gene product produced, a decrease or elimination of the reporter gene product indicating a compound for treating the viral infection. Compounds detected by this method would inhibit transcription of the gene from which the regulatory region was isolated, and thus, in treating a subject, would inhibit the production of the gene product produced by the gene, and thus treat the viral infection.

[0057] The present invention additionally provides a method of reducing or inhibiting a viral infection in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in any of SEQ ID NO:1 through SEQ ID NO:75, or a homolog thereof, thereby treating the viral infection. The composition can comprise, for example, an antibody that binds a protein encoded by the gene. The composition can also comprise an antibody that binds a receptor for a protein encoded by the gene. Such an antibody can be raised against the selected protein by standard methods, and can be either polyclonal or monoclonal, though monoclonal is preferred. Alternatively, the composition can comprise an antisense RNA that binds an RNA encoded by the gene. Furthermore, the composition can comprise a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene. Other useful compositions will be readily apparent to the skilled artisan.

[0058] The present invention further provides a method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising the nucleic acid set forth in any of SEQ ID NO:1 through SEQ ID NO:75, or a homolog thereof, to a gene form incapable of producing a functional gene product of the gene or a gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject. The cell can be selected according to the typical target cell of the specific virus whose infection is to be reduced, prevented or inhibited. A preferred cell for several viruses is a hematopoietic cell. When the selected cell is a hematopoietic cell, viruses which can be reduced or inhibited from infection can include, for example, HIV, including HIV-1 and HIV-2.

[0059] The present invention also provides a method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by a method comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted,

[0060] to a mutated gene form incapable of producing a functional gene product of the gene or to a mutated gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject. Thus the mutated gene form can be one incapable of producing an effective amount of a functional protein or mRNA, or one incapable of producing a functional protein or mRNA, for example. The method can be performed wherein the virus is HIV. The method can be performed in any selected cell in which the virus may infect with deleterious results. For example, the cell can be a hematopoietic cell. However, many other virus-cell combinations will be apparent to the skilled artisan.

[0061] The present invention additionally provides a method of increasing viral infection resistance in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by a method comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, to a mutated gene form incapable of producing a functional gene product of the gene or a gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject. The virus can be HIV, particularly when the cell is a hematopoietic cell. However, many other virus-cell combinations will be apparent to the skilled artisan.

[0062] The present invention provides a method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising (a) transferring into a cell culture incapable of growing well in soft agar or Matrigel a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected cells which are capable of growing in soft agar or Matrigel a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell. This method can be performed using any selected non-transformed cell line, of which many are known in the art.

[0063] The present invention additionally provides a method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising (a) transferring into a cell culture of non-transformed cells a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected and transformed cells a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell. A non-transformed phenotype can be determined by any of several standard methods in the art, such as the exemplified inability to grow in soft agar, or inability to grow in Matrigel.

[0064] The present invention further provides a method of screening for a compound for suppressing a malignant phenotype in a cell comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product involved in establishment of a malignant phenotype in the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for suppressing the malignant phenotype. The level, or amount, of gene product produced can be measured, directly or indirectly, by any of several methods standard in the art (e.g., protein gel, antibody-based assay, detecting labeled RNA) for assaying protein levels or amounts, and selected based upon the specific gene product.

[0065] The present invention further provides a method of suppressing a malignant phenotype in a cell in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83, or a homolog thereof, thereby suppressing a malignant phenotype. The composition can, for example, comprise an antibody that binds a protein encoded by the gene. The composition can, as another example, comprise an antibody that binds a receptor for a protein encoded by the gene. The composition can comprise an antisense RNA that binds an RNA encoded by the gene. Further, the composition can comprise a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene.

[0066] Diagnostic or therapeutic agents of the present invention can be administered to a subject or an animal model by any of many standard means for administering therapeutics or diagnostics to that selected site or standard for administering that type of functional entity. For example, an agent can be administered orally, parenterally (e.g., intravenously), by intramuscular injection, by intraperitoneal injection, topically, transdermally, or the like. Agents can be administered, e.g., as a complex with cationic liposomes, or encapsulated in anionic liposomes. Compositions can include various amounts of the selected agent in combination with a pharmaceutically acceptable carrier and, in addition, if desired, may include other medicinal agents, pharmaceutical agents, carriers, adjuvants, diluents, etc. Parenteral administration, if used, is generally characterized by injection. Injectables can be prepared in conventional forms, either as liquid solutions or suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or as emulsions. Depending upon the mode of administration, the agent can be optimized to avoid degradation in the subject, such as by encapsulation, etc.

[0067] Dosages will depend upon the mode of administration, the disease or condition to be treated, and the individual subject's condition, but will be that dosage typical for and used in administration of antiviral or anticancer agents. Dosages will also depend upon the composition being administered, e.g., a protein or a nucleic acid. Such dosages are known in the art. Furthermore, the dosage can be adjusted according to the typical dosage for the specific disease or condition to be treated. Furthermore, viral titers in culture cells of the target cell type can be used to optimize the dosage for the target cells in vivo, and transformation from varying dosages achieved in culture cells of the same type as the target cell type can be monitored. Often a single dose can be sufficient; however, the dose can be repeated if desirable. The dosage should not be so large as to cause adverse side effects. Generally, the dosage will vary with the age, condition, sex and extent of the disease in the patient and can be determined by one of skill in the art. The dosage can also be adjusted by the individual physician in the event of any complication.

[0068] For administration to a cell in a subject, the composition, once in the subject, will of course adjust to the subject's body temperature. For ex vivo administration, the composition can be administered by any standard methods that would maintain viability of the cells, such as by adding it to culture medium (appropriate for the target cells) and adding this medium directly to the cells. As is known in the art, any medium used in this method can be aqueous and non-toxic so as not to render the cells non-viable. In addition, it can contain standard nutrients for maintaining viability of cells, if desired. For in vivo administration, the complex can be added to, for example, a blood sample or a tissue sample from the patient, or to a pharmaceutically acceptable carrier, e.g., saline and buffered saline, and administered by any of several means known in the art. Examples of administration include parenteral administration, e.g., by intravenous injection including regional perfusion through a blood vessel supplying the tissue(s) or organ(s) having the target cell(s), or by inhalation of an aerosol, subcutaneous or intramuscular injection, topical administration such as to skin wounds and lesions, direct transfection into, e.g., bone marrow cells prepared for transplantation and subsequent transplantation into the subject, and direct transfection into an organ that is subsequently transplanted into the subject. Further administration methods include oral administration, particularly when the composition is encapsulated, or rectal administration, particularly when the composition is in suppository form. A pharmaceutically acceptable carrier includes any material that is not biologically or otherwise undesirable, i.e., the material may be administered to an individual along with the selected complex without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition in which it is contained.

[0069] Specifically, if a particular cell type in vivo is to be targeted, for example, by regional perfusion of an organ or tumor, cells from the target tissue can be biopsied and optimal dosages for import of the complex into that tissue can be determined in vitro, as described herein and as known in the art, to optimize the in vivo dosage, including concentration and time length. Alternatively, culture cells of the same cell type can also be used to optimize the dosage for the target cells in vivo.

[0070] For either ex vivo or in vivo use, the complex can be administered at any effective concentration. An effective concentration is that amount that results in reduction, inhibition or prevention of the viral infection or in reduction or inhibition of the transformed phenotype of the cells.

[0071] A nucleic acid can be administered by any of several means, which can be selected according to the vector utilized, the organ or tissue, if any, to be targeted, and the characteristics of the subject. The nucleic acids, if desired in a pharmaceutically acceptable carrier such as physiological saline, can be administered systemically, such as intravenously, intraarterially, orally, parenterally, or subcutaneously. The nucleic acids can also be administered by direct injection into an organ or by injection into the blood vessel supplying a target tissue. For an infection of cells of the lungs or trachea, the nucleic acid can be administered intratracheally. The nucleic acids can additionally be administered topically, transdermally, etc.

[0072] The nucleic acid or protein can be administered in a composition. For example, the composition can comprise other medicinal agents, pharmaceutical agents, carriers, adjuvants, diluents, etc. Furthermore, the composition can comprise, in addition to the vector, lipids such as liposomes, such as cationic liposomes (e.g., DOTMA, DOPE, DC-cholesterol) or anionic liposomes. Liposomes can further comprise proteins to facilitate targeting a particular cell, if desired. A composition comprising a vector and a cationic liposome can be administered to the blood afferent to a target organ or inhaled into the respiratory tract to target cells of the respiratory tract. Regarding liposomes, see, e.g., Brigham et al. Am. J. Resp. Cell. Mol. Biol. 1:95-100 (1989); Felgner et al. Proc. Natl. Acad. Sci USA 84:7413-7417 (1987); U.S. Pat. No. 4,897,355.

[0073] For a viral vector comprising a nucleic acid, the composition can comprise a pharmaceutically acceptable carrier such as phosphate buffered saline or saline. The viral vector can be selected according to the target cell, as known in the art. For example, adenoviral vectors, in particular replication-deficient adenoviral vectors, can be utilized to target any of a number of cells, because of their broad host range. Many other viral vectors are available, and their target cells known.

EXAMPLES

[0074] Selective Elimination of Virally Infected Cells from a Cell Culture

[0075] Rat intestinal cell line-1 cells (RIE-1 cells) were routinely grown in Dulbecco's modified Eagle's medium, high glucose, supplemented with 10% fetal bovine serum. To begin the experiment, cells persistently infected with reovirus were grown to near confluence, then serum was removed from the growth medium by removing the medium, washing the cells in PBS, and returning to the flask medium not supplemented with serum. Typically, the serum content was reduced to 1% or less. The cells are starved for serum for several days, or as long as about a month, to bring them to quiescence or growth arrest. Media containing 10% serum is then added to the quiescent cells to stimulate growth of the cells. Surviving cells are found not to be persistently infected, as shown by immunohistochemical techniques used to establish whether cells contain any infectious virus (sensitivity to 1 infectious virus per ml of homogenized cells).

[0076] Cellular Genomic DNA Isolation

[0077] Gene Trap Libraries: The libraries are generated by infecting the RIE-1 cells with a retrovirus vector (U3 gene-trap) at a ratio of less than one retrovirus for every ten cells. When a U3 gene trap retrovirus integrates within an actively transcribed gene, the neomycin resistance gene that the U3 gene trap retrovirus encodes is also transcribed; this confers on the cell resistance to the antibiotic neomycin. Cells with gene trap events are able to survive exposure to neomycin while cells without a gene trap event die. The various cells that survive neomycin selection are then propagated as a library of gene trap events. Such libraries can be generated with any retrovirus vector that has the properties of expressing a reporter gene from a transcriptionally active cellular promoter that tags the gene for later identification.
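The effect of the low infection ratio described above can be illustrated with a short calculation. The sketch below assumes, as a simplification that the text does not state, that the number of retroviral integrations per cell follows a Poisson distribution whose mean equals the virus-to-cell ratio. The function names are hypothetical and serve only this illustration.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson model (assumed here)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def integration_profile(virus_per_cell: float) -> dict:
    """Fractions of cells with zero, one, or several integrations."""
    p_none = poisson_pmf(0, virus_per_cell)
    p_single = poisson_pmf(1, virus_per_cell)
    p_any = 1.0 - p_none
    return {
        "untransduced": p_none,
        "single": p_single,
        "multiple": p_any - p_single,
        # Share of transduced cells that carry exactly one gene trap event.
        "single_among_transduced": p_single / p_any,
    }


# Upper bound of the ratio given in the text: one retrovirus per ten cells.
profile = integration_profile(0.1)
```

Under this assumed model, a ratio of 0.1 leaves roughly 90% of cells without an integration, and about 95% of the cells that do carry the vector have a single integration, which keeps each neomycin-resistant clone associated with one tagged gene in most cases.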

[0078] Reovirus selection: Reovirus infection is typically lethal to RIE-1 cells but can result in the development of persistently infected cells. These cells continue to grow while producing infective reovirus particles. For the identification of gene trap events that confer reovirus resistance to cells, the persistently infected cells must be eliminated or they will be scored as false positives. We have found that RIE-1 cells persistently infected with reovirus tolerate serum starvation, passaging and plating at low density very poorly. Thus, we have developed protocols for the screening of the RIE-1 gene trap libraries that select against both reovirus sensitive cells and cells that are persistently infected with reovirus.

[0079] 1. RIE-1 library cells are grown to near confluence and then the serum is removed from the media. The cells are starved for serum for several days to bring them to quiescence or growth arrest.

[0080] 2. The library cells are infected with reovirus at a titer of greater than ten reovirus per cell and the serum starvation is continued for several more days.

[0081] 3. The infected cells are passaged (a process in which they are exposed to serum for three to six hours) and then starved for serum for several more days.

[0082] 4. The surviving cells are then allowed to grow in the presence of serum until visible colonies develop, at which point they are cloned by limiting dilution.
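The high virus-to-cell ratio in step 2 of the protocol above can be put in numbers. The sketch below assumes that the number of infecting virions per cell is Poisson distributed, an assumption not made in the text, and the library size used in the example is a hypothetical figure chosen only for illustration.

```python
import math


def fraction_uninfected(virions_per_cell: float) -> float:
    """Fraction of cells that receive no virion under the assumed Poisson model."""
    return math.exp(-virions_per_cell)


def expected_uninfected(n_cells: int, virions_per_cell: float) -> float:
    """Expected number of cells that escape infection purely by chance."""
    return n_cells * fraction_uninfected(virions_per_cell)


# Hypothetical library of one million cells infected at ten virions per cell.
escaped = expected_uninfected(1_000_000, 10.0)
```

Under these assumptions only a few dozen cells per million would escape infection by chance at ten virions per cell, so most surviving colonies would reflect resistance rather than a missed infection.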

[0083] MEDIA: DULBECCO'S MODIFIED EAGLE'S MEDIUM, HIGH GLUCOSE (DME/HIGH), Hyclone Laboratories cat. no. SH30003.02.

[0084] NEOMYCIN: The antibiotic used to select against the cells that did not have a U3 gene trap retrovirus. We used GENETICIN, from Sigma, cat. no. G9516.

[0085] RAT INTESTINAL CELL LINE-1 CELLS (RIE-1 CELLS): These cells are from the laboratory of Dr. Ray Dubois (VAMC). They are typically cultured in Dulbecco's Modified Eagle's Medium supplemented with 10% fetal calf serum.

[0086] REOVIRUS: Laboratory strains of either serotype 1 or serotype 3 are used. They were originally obtained from the laboratories of Bernard N. Fields (deceased). These viruses have been described in detail.

[0087] RETROVIRUS: The U3 gene trap retroviruses used here were developed by Dr. Earl Ruley (VAMC) and the libraries were produced using a general protocol suggested by him.

[0088] SERUM: FETAL BOVINE SERUM Hyclone Laboratories cat. no. A-1115-L.

[0089] Genes Necessary for Viral Infection

[0090] Characteristics of some of the isolated sequences include the following:

[0091] SEQ ID NO:1—rat genomic sequence of vacuolar H⁺-ATPase (chemically inhibiting the activity of the gene product results in resistance to influenza virus and reovirus)

[0092] SEQ ID NO:2—rat alpha tropomyosin genomic sequence

[0093] SEQ ID NO:3—rat genomic sequence of murine and rat gas5 gene (cell cycle regulated gene)

[0094] SEQ ID NO:4—rat genomic sequence of p162 of ras complex, mouse, human (cell cycle regulated gene)

[0095] SEQ ID NO:5—similar to N-acetyl-glucosaminyltransferase I mRNA, mouse, human (enzyme located in the Golgi region in the cell; has been found as part of a DNA containing virus)

[0096] SEQ ID NO:6—similar to calcyclin, mouse, human, reverse complement (cell cycle regulated gene)

[0097] SEQ ID NO:7—contains sequence similar to: LOCUS AA254809 364 bp mRNA EST DEFINITION mz75a10.r1 Soares mouse lymph node NbMLN Mus musculus cDNA clone 719226 5′

[0098] SEQ ID NO:8—contains a sequence similar to No SW:RSP1_MOUSE Q01730 RSP-1 PROTEIN

[0099] SEQ ID NO:9—contains 5′ UTR of gb|U25435|HSU25435 Human transcriptional repressor (CTCF) mRNA, complete cds, Length = 3780

[0100] SEQ ID NO:38—similar to cDNA of retroviral origin

[0101] SEQ ID NO:50—trapped AYU-6 genetic element

[0102] Isolation of Cellular Genes that Suppress a Malignant Phenotype

[0103] We have utilized a gene-trap method of selecting cell lines that have a transformed phenotype (are potentially tumor cells) from a population of cells (RIE-1 parentals) that are not transformed. The parental cell line, RIE-1 cells, does not have the capacity to grow in soft agar or to produce tumors in mice. Following gene-trapping, cells were screened for their capacity to grow in soft agar. These cells were cloned and genomic sequences were obtained 5′ or 3′ of the retrovirus vector (SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83). All of the cell lines behave as if they are tumor cell lines, as they also induce tumors in mice.

[0104] Of the cell lines, two are associated with the enhanced expression of the prostaglandin synthetase gene II or COX 2. The COX 2 gene has been found to be increased in pre-malignant adenomas in humans and overexpressed in human colon cancer. Inhibitors of COX 2 expression also arrest the growth of the tumor. One of the cell lines, x18 (SEQ ID NO:76), has disrupted a gene that is now represented in the EST (dbest) database, but the gene is not known (not present in GenBank). (SEQ ID NO:76): >02-X18H-t7 . . . , identical to: gb|W55397|W55397 mb13h04.r1 Life Tech mouse brain Mus at 1.0e-114. x18 has also been sequenced from the vector with the same EST being found. (SEQ ID NO:77): >x8_b4_(—)2 . . . (SEQ ID NO:78): >x7_b4 . . . (SEQ ID NO:79): >x4-b4 . . . (SEQ ID NO:80): >x2-b4 . . . (SEQ ID NO:81): >x15-b4 . . . (SEQ ID NO:82): >x13-re . . . , reverse complement. (SEQ ID NO:83): >x12_b4.
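Several of the listed sequences are reported as the reverse complement of the matching database entry. The sketch below shows one way to compute a reverse complement while preserving the ambiguous base N that appears throughout the sequence listing. The function name is hypothetical, and the handling of N is an assumption made for this illustration rather than a description of the software actually used.

```python
# Translation table for the four bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string; N stays N."""
    return seq.translate(_COMPLEMENT)[::-1]


example = reverse_complement("ATGCN")
```

Applying the function twice returns the original string, which is a convenient sanity check when comparing a trapped sequence with a database hit in both orientations.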

[0105] Each of the genes from which the provided nucleotide sequences are isolated represents a tumor suppressor gene. The mechanism by which the disrupted genes other than the gene comprising the nucleic acid whose sequence is set forth in SEQ ID NO:76 may suppress a transformed phenotype is at present unknown. However, each one represents a tumor suppressor gene that is potentially unique, as none of the genomic sequences correspond to a known gene. The capacity to quickly select tumor suppressor genes may provide unique targets in the process of treating or preventing (potential for diagnostic testing) cancer.

[0106] Isolation of Entire Genomic Genes

[0107] An isolated nucleic acid of this invention (whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83), or a smaller fragment thereof, is labeled by a detectable label and utilized as a probe to screen a rat genomic library (lambda phage or yeast artificial chromosome vector library) under high stringency conditions, i.e., high salt and high temperatures to create hybridization and wash temperature 5-20° C. Clones are isolated and sequenced by standard Sanger dideoxynucleotide sequencing methods. Once the entire sequence of the new clone is determined, it is aligned with the probe sequence and its orientation relative to the probe sequence determined. A second and a third probe are designed using sequences from either end of the combined genomic sequence, respectively. These probes are used to screen the library and isolate new clones, which are sequenced. These sequences are aligned with the previously obtained sequences, new probes are designed corresponding to sequences at either end, and the entire process is repeated until the entire gene is isolated and mapped. When one end of the sequence cannot isolate any new clone, a new library can be screened. The complete sequence includes regulatory regions at the 5′ end and a polyadenylation signal at the 3′ end.
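The repeated align-and-extend cycle described above can be sketched in a few lines. This is a minimal illustration, assuming exact matching overlaps between a newly sequenced clone and the existing combined sequence; real clone data contain ambiguous bases and sequencing errors, so a practical alignment step would need to tolerate mismatches. All function names and the minimum overlap length are hypothetical.

```python
def longest_overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def extend_contig(contig: str, clone: str, min_len: int = 3) -> str:
    """Merge a clone onto either end of the contig when the ends overlap."""
    size = longest_overlap(contig, clone, min_len)
    if size:
        # Clone extends the 3' end of the combined sequence.
        return contig + clone[size:]
    size = longest_overlap(clone, contig, min_len)
    if size:
        # Clone extends the 5' end of the combined sequence.
        return clone + contig[size:]
    # No overlap found: leave the contig unchanged.
    return contig


right_extended = extend_contig("ACGTACGGA", "CGGATTTC")
left_extended = extend_contig("CGGATTTC", "ACGTACGGA")
```

Each round of library screening would supply a new clone to `extend_contig`, and the ends of the growing sequence would supply the next pair of probes.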

[0108] Isolation of cDNAs

[0109] An isolated nucleic acid (whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83, and preferably any of SEQ ID NO:5 through SEQ ID NO:83), or a smaller fragment thereof, or additional fragments obtained from the genomic library that contain open reading frames, is labeled by a detectable label and utilized as a probe to screen a cDNA library. A rat cDNA library yields a rat cDNA; a human cDNA library yields a human cDNA. Repeated screens can be utilized as described above to obtain the complete coding sequence of the gene from several clones if necessary. The isolates can then be sequenced to determine the nucleotide sequence by standard means such as dideoxynucleotide sequencing methods.

[0110] Serum Survival Factor Isolation and Characterization

[0111] The lack of tolerance to serum starvation is due to the acquired dependence of the persistently infected cells on a factor (survival factor) that is present in serum. The serum survival factor for persistently infected cells has a molecular weight between 50 and 100 kD and resists inactivation in low pH (pH 2) and chloroform extraction. It is inactivated by boiling for 5 minutes [once fractionated from whole serum (50 to 100 kD fraction)], and in low ionic strength solution [10 to 50 mM].

[0112] The factor was isolated from serum by size fractionation using Centriprep molecular cut-off filters with excluding sizes of 30 and 100 kD (Millipore and Amicon), and dialysis tubing with a molecular exclusion of 50 kD. Polyacrylamide gel electrophoresis and silver staining were used to determine that all of the resulting material was between 50 and 100 kD, confirming the validity of the initial isolation. Further purification was performed using ion exchange chromatography and heparin sulfate adsorption columns, followed by HPLC. Activity was determined following adjusting the pH of the serum fraction (30 to 100 kD fraction) to different pH conditions using HCl and readjusting the pH to pH 7.4 prior to assessment of biologic activity. Low ionic strength sensitivity was determined by dialyzing the fraction containing activity into low ionic strength solution for various lengths of time and readjusting ionic strength to physiologic conditions prior to determining biologic activity by dialyzing the fraction against the media. The biologic activity was maintained in the aqueous solution following chloroform extraction, indicating the factor is not a lipid. The biologic activity was lost after the 30 to 100 kD fraction was placed in a 100° C. water bath for 5 minutes.

[0113] Isolated Nucleic Acids

[0114] Tagged genomic DNAs isolated were sequenced by standard methods using Sanger dideoxynucleotide sequencing. The nucleotide sequences of these nucleic acids are set forth herein as SEQ ID NO:1 through SEQ ID NO:75 (viral infection genes) and SEQ ID NO:76 through SEQ ID NO:83 (tumor suppressor genes). The sequences were run through computer databanks in a homology search. Sequences for some of the “6b” sequences [obtained from genomic library 6, flask b] (i.e., SEQ ID NO:37, 38, 39, 42, 61, 65, 66, 69) correspond to a known gene, alpha tropomyosin, and some of the others correspond to the vacuolar-H⁺-ATPase. These sequences are associated with both acute and persistent viral infection and the cellular genes which comprise them, i.e., alpha tropomyosin and vacuolar-H⁺-ATPase, can be targets for drug treatments for viral infection using the methods described above. These genes can be therapy targets particularly because disruption of one or both alleles results in a viable cell.
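The comparison step behind a homology search can be illustrated with a simple identity score. The sketch below is not the databank search used for these sequences, which the text does not describe in detail; it assumes two sequences that are already aligned and of equal length, and it skips positions where either read has the ambiguous base N. The function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned positions where neither base is N."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    informative = [
        (x, y) for x, y in zip(a.upper(), b.upper()) if x != "N" and y != "N"
    ]
    if not informative:
        return 0.0
    matches = sum(x == y for x, y in informative)
    return 100.0 * matches / len(informative)


# Four informative positions, three of which match.
score = percent_identity("ACGTN", "ACGAA")
```

A real databank search would use local alignment with gaps and a statistical significance score, so this function is best read as an illustration of how ambiguous bases can be excluded from a comparison.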

1 83 828 base pairs nucleic acid double linear DNA (genomic) 1AAAAAAAAAT TACCATTTTT GGGNGAACCT TTNATANTTN GTTCCTAGAG GGNGAGTCAG 60GGGTAAAAAA AACGATNAAG GGAGTTGNGG CGATTGGAGA AGCTATTATG AAGGGATAAA 120ANACTTAGGT TGAGCCGGCG GGTGGGGTGT ATTCTTGGGG TGGNGAAAAG NNAGATCAAC 180ATGAGATTTT TTTGTTTTAG GTTTTGCATG TTGTAATGCA ATANTTTAAC CTGATTTTAT 240GTGCAGGATG CCTGAGGTTT GTGAGCAGGA ACACAGGAAA AGGAACACCG GTANTCGAAC 300ACCGGTGAGT CCGCGCAGCC GCAGAGAAGG CGGGTATCAT TCGNTCCACC CTGTATGNTA 360ATATGGAGCG CTACGGCCCC GCCCCTGGGG CCGATGGGCC CAAAAAGGTA GGGTTCGAGA 420AGACGTCTGC ATGGAGCAGT GGACCAGTGA AGACCCAGGC AAGGCCGAAC GTTGGGCCCC 480GGGCCCCGGG GGCGGGTAGC AGGGCCCATA CATTGTCCAA GGGCTGCTGG AGAGCCTGGA 540GCCTCGCTCC CCCACCGGCG CAAAGTGGTA CAGCCCATGG GGGCGTGGCC CATATCATGG 600ACGCGAGCGC GGCCGCCATC TTGNTCTGCG GTGCTGGTAT TTAGAGCGCA GCGCCTGACT 660GGCGGGGTCG CCTTCGCATC CGCCGCTTCG AGAATCTTCT TTCGTCTGCT CGCTCTCTCT 720CCCGTCGTCC TAGCCCGCCG CCGCCTGCTG AGCTTGCCCT CTTCCCCGCT TGCAGACATG 780GNGGACATTG AAAGACCCTA CCTNAAGGGC CNGCANGCNA GAAAAAGT 828 845 base pairsnucleic acid double linear DNA (genomic) 2 TCNCCTAAGA NANGAGANAGGTTAGATGGN AATGGAGANT ANATACCGGG CTTAGCTTCG 60 CCNNGGACCC ACCNAGGGGAAAAGAGCCNT CNNGCAACAA ACNAAAGGAN CGGAAAGAGG 120 AAGGGNANGN GGNNAAACANATTGGGCGAA TTTAAAANCT NNGNCCNGTT TGAAATAGNG 180 CNCGGCCGNT CCNTGGGCCNGATCCANCCT TCCNTNACTT TTCNTCCCCN GCNTTAAATT 240 GCGNCGNCGG CCCCCCCAACCATNTNTTCC GTTTTNANCA CCNGNGGCCC CGGCAGTGCN 300 GATGNNGGGG AATTGNNAATGCCCCCCANC CATTTTGNNT CNGNNCCTGG GGAGAGANTN 360 AAACGGTGNG NGNAGNNGTTAATATGGCGG CAGCGGNGAC ANCAGTAGCC AGNGCAGGCA 420 CGCGNAGTTG GCNGGGGACGCCANGTGNCN GGAGANNTGG AGCGGCGGCG GAGCGGGCNC 480 CNAAAAAAAA AAANAANNGNTGGTAAGGGG GCCCGGGGTG GANGANATTT CNNGGGCNGC 540 TTCTAGGNGT CANGNTGNGGCCGCTNCGTT CGGCCCTGGA TGNAGCCCNG NGCCNGTGCC 600 NCCNCCGGGG GGAGTTTGTTTCCNTCTACC GTNCCCTGCT GNGGAGCGAC GANCTGCANT 660 CCCCNGGAGC GTCTANNAGGCCGTGGCNAA CCCCATCNAN GCNCNCCAGT NAGCTTCCTT 720 CNTCCCGACA TAGTAGGCGTCNGGNGGCGT TGNCGACAGN GGCCNNCGTC GATGGGANNN 780 TCTATTTNNG NTTCATGGGCCGTATGTTAG 
ACCTNTCGAA GGACGCGNNA AATAGATAGG 840 GGGGG 845 818 base pairsnucleic acid double linear DNA (genomic) 3 TACACCTTTG NGNGTGTTGAAAATTACGGG GGANANGAAN AAAAANGTAT CCTTTTGGAN 60 GCCCCGGNCT CTTGTGGAATTTGTGATTTA CGGCGGNANT CATATGATTT CGGAAANAAG 120 ATAAAGCCNN NCNNNNNGGGGTAGGGAAGA AGGATTTTGN AAACAAANTN TGGGTNTATA 180 TAANNGTGGG GGGGGGAGNTCATTGAGGNG GGGNGGAATA TNNAATNTTT TTTTTTTNNT 240 TNNNNGGCAA GAGGGATGAAGGTAAGGTTA GTATGAAATG GCCNNNCCAG AGAAGTTNGA 300 TGAAAAAGAT AGTGCCACCAAGAGANATNA TTTGTTATTT TTAACAGTGG GGGGAGGTAG 360 TTNTAGACCA CCATTTATTANAACTGAGGC ACAAAGAAGA TGATTGGGGG GCACTTACAG 420 AGTAAGCAGT ATTTACATAAAGATTTNTTC CCCAGGAATN ANGAGGAAGN TGGATAACTG 480 AACAAAGCCA TGTAAGCAGGCTTTTTGGTA TGCATGTGGT CCCATTACAA GGAATACCCA 540 ATAAATAGCA AATGCACACTGCCATTCACA AGCAATTGCA GAGAATGGGT GGGGGATGTG 600 AAACTAAAGA GCTTTGTAGCTGCCTGAGGA GGTGGGTTCT CTATATCCGT GGGAGCTAGT 660 GATCCCCCAC AGGTCTTAGCTGGTGCCATG ATTGTGATCT TAGGCCAGAT TTGATGTCCC 720 CCACATGGCC GAGTCCGCCATGGATGCAAC AGGGCAGCTT TATTTGCTGT GGGCNGGTAN 780 TGAAGGATNT CACAAATGAACTTGGCAAGT AGAGAGGT 818 857 base pairs nucleic acid double linear DNA(genomic) 4 TGGAAAGANT GNGNTAAAGT TNAGTTNNNA GATATTGANN AANNTNGGGNAAAANAAGGT 60 GNNNNACAAT CTCNCAANNA TTTNAANGAA GGGGGAATAA ATGNAAANTGGGANTTAAAA 120 AAANAGGGGN NANANGNTTN NGGTTNAANA NAAGGGGGGT NTNCCCGTTTTTTTTTTAGG 180 ATCCTGGGAG TAACCNACAG GAACCNAAAA TTNGNANAAG GGNGNTCCTTCCCTTCCNGT 240 CAGTAAGGGA TGGGGCCCTA TTTTTANCAA CGAACACCAT TGACAGGANACCGGTCAGNA 300 TTCCGTTAAG TATTTTGACC TTTCCAGGGG ATGTNTCCGC ACAGCCGTTGNGACCTTAAA 360 CGCGNCCAGA TTNTGCGAAN GTCATTTTGG GAATGACTGT TGTAGACACTGCTTTTTTAG 420 TCGCAGATNT GACCGCAGAT TTTCNTTTCC CACCTTATGT CCGNTGGAGCAGTGGTGGCC 480 GGAGAAAATT TCTTGGGGTT CCNTCCCGNG ACCCAAAGAA CACAACTGTTCTCGCTGCCC 540 GGCACCCATC GCCACGTCAG CTCACGCTCG CGACGCCAGC ACGCNTGCGCGCAGAGAAAG 600 GCGGAGCATG CGCAAAGGCC TGCNTNTAAC ATCCGGGGCT CGGGCGGCGGCGCTGCCGCC 660 GCGAGGGATT AANGGGGTCT TTCNTTTCNG TCTCTGGCCG GCTGGGCGCGGGCGACTGCT 720 GGCGAGGCGC GTGGAAGCTC GCGATAGTTC CCCTCCGCCT CCTCTTCCCGGTCCAGGCCA 780 
CTAGGGAGTT CGCTGACGCC GGGTGAACTG AGCGTACCGC CTGAAAGACCCCACAAGTAG 840 GTTTGGCAAG TAGAAAG 857 896 base pairs nucleic acid doublelinear DNA (genomic) 5 GGGAGAAAGG GGCGACNTTT ATTGGTCCNG GAGNGGGGGGNCAAATGGGT TTTTATCCAN 60 TTTAACGGGG GGAGGCCCCG GNNGAGGAAT TCCCGGGGGAGGAANAAAAA CAAGATCCGC 120 NTAAGAGGGN GGGGGTNTCC GNNNTTNTTN GAATNGTGGNGCACCGGGGG GGCAAGGAAG 180 AGGGTTCCCG GAGAATGGGG NGGATAAAAN GATTGGCAACTCACCCCGGN TAGTTGTACC 240 AGGTGTTTTT TTTTTTTTTT TTTGTTCANA AANAGGAAAATGATTCAAGT TAAAAAAGTA 300 ATTGGCAAGG AAATTTTTTT CCTANCCTCC TTGAAAAATAGTGGGAACAG GGGTTCCCAA 360 GGGGAAAGGT CCCCNATTNA ACAAAATGNG TTTCAGNGGAGTGTGGCCCA CCCATTGTGT 420 NTCCATGGAA GAGTGGCTTT TNTGGNGAAG TTCATTTTCCTTAACCTTNA NNACTGTAAN 480 GGNTCTTGTG CTTGAGAATA TTGTTGGCCA GCTTTATNGTCTTCATTTNT AANACTATTT 540 AGACTAGAGT GTTNTAGATT NTAGGTCTTC ANGTTTCCAGTCACCAGTCC TTGGCTTTTT 600 AGTATGGAAA TCACCAGTAA TGGCAATATA ACATCCCTGCTTCTGTTTCT TAGAAGGCTN 660 NATTACAGTG TGTTCAAACT CCGTGTCATT GCAACAGGTTAAACTAACTT TNTACGTAGG 720 ACATCAGGGT ATTGACATTC TCATCCTAAA GTCAGTTTGTCTGTTTCCAG AGGAGGAACT 780 GAAGCAGTGG TTCTTTAAGT AACTGACTCA GGGCTTTCCTGCCTGGCGCG CCTGCCAGGC 840 ATNGTGTAGC ATTGTACTGC ATCTTCTTTG ACCAGTTTCCCCAGGTGAAG AGCCTG 896 937 base pairs nucleic acid double linear DNA(genomic) 6 GGGCCCCCCC CCCCCNANTT AATTTTNGGG AAGAAAAAAG GGAAAAAANTTTGGGGTCAG 60 GAAAAANGAA GTTGGNAANC GNNGGGGNGN CAGNATTNGA ANAGTGGGGGANNTTAATTT 120 NAGAGGTCCC TTNNTTCCNN GGAAAAGTTT AAAAGGGGTT CAATTAACTTNGGATCNCCA 180 TTTATCAGAT TACCCGNGNG TCACCTGGGG ACCCTTTACN GGTGGCGGGACATTNGAAAN 240 ACATATTAGT CAGATTATAC ATAGCAAANA TAGTTAGGAG CACAANGAATCATTTATGGT 300 GGNGGTCACC ACACAGGAGA TGTATTATCC GCAGTATTAG AGAGTTGAGAACCATATNTT 360 AGAGATGCGG TAGACTGACT GTTCCCTTTT CGNTTGGAGT GACCTTGCCATTAGAGGCAA 420 CAGCATCAGT ATTGTTCCCA GTCCCCNTCA CACTGATTCG AACTTTAAGGACACTGATCT 480 NTGGCTGGTA GAGGTTCAGC ACACATACCA GAGTTACGAG TCACGTGCCAGAAGGGCAAA 540 CTGAACACGG AATTAGAGGG AACTCGATGT CTCCGGCTTG CACTGGTCTTCTCTTGCANT 600 AGAATCCTTC ATCCTGCTCC CAGTCCGGAC GTCCAGGCAA CAAGGGCGTGGAAAGTGAGG 660 
GGGCTGGGAG GTGTGTTTGC CTTGCCTCAG GCGNTGGGTG GGGTTGGGGCGTGCCAGCAC 720 TCCCCTGGGC GGGCNTCACC GATGCTGGCC ACTATAAGGC CAGCCAGACTGCGACACAGT 780 CCATCCCCTC GACCACTCTT TTGGCGCTTC ATTGTCGACG TGTGGTGAGCTCTCACTGGG 840 GCGTCCCTCT AAGATCTGTC CACTNCCTGG TCTAGGGGTT AAGCNTTTTCCTGCCCTGAA 900 AGACCCCACA ATGTAGNTTT GGCAAGCTAG CAAAGGT 937 888 basepairs nucleic acid double linear DNA (genomic) 7 AAAAGGGGGC CCCAGCGGNGGGGGGTTGTC CAAGGAATCA AAANGTGGGG NGGGGGGGAA 60 AAAANTACTT TTAAAAAAGGCNGCCNNANA ATANANGACG TTCNGGGGNG TTTGAAAAAA 120 GGCCGGAAGC CTCGGACNGGTTTCNNTGTT AGGACAAGGA AAAAGGGNAC GCACNGGGAT 180 TTCCTTTCCT TATNTTAGCAAATNGCCGGC CAGGAAACCA NCGAGTTGGG NGGGNTTNGG 240 TTTTCNGTNA AAGGAAAGCAGGGGGGGGAN AAACACGGAN AAAAAGGGAA GAANNGGGTT 300 NATTNNGGTT AGNAATTGGNTCCCAGAGAG NGCCAAGAAA ATNGGCCTGT CCAAAATTCT 360 TTTTCCCNGC TTTTAAGACAGGCANGATAN TATNNGGCAG CAGGTNATTA CCANAGGTAA 420 GTAAATTACA ATGGGTAAGGGCTTGGCACA GGCCAGGGTA AGTAGGGCAN GTATGGATGT 480 TAAACATTAC CCTTCATCCNGAGGNAGTTA ACACAAGCAT TCNTGGCGGG TCTCACATAT 540 CCCAAANAAA AATNTTCAAAAGNAGCCCCN TGGGGAACGT TAAGCCAAGC NTANGACTCA 600 CAAGGGANGA CATGGGCAGGNTAGGGNACA GAATCAGTGN TCAGAGACTC CAGGGGCACC 660 CCTGATTCCN TTTGNTGTCACACAGACANT GCTCCAGGGA CAACCTTCCC GGANGTGAGT 720 ATANGACTTT CCTGATGGNGACGCTGCCGT GANGGGACAC TNCCTCGTGG TAGCACACAT 780 TCCTCAGTCA GCTTCTGAGCCTCAGGGTCC CAGCAGGCAC AGTGGCAANG ACCTCATTCT 840 TCTCGTCTGT CCCACTGAAAGACNNTCACN AAGGAGCTGG CTAGTAGA 888 980 base pairs nucleic acid doublelinear DNA (genomic) 8 AGAAATGAAA AAGAAGGAAA GCTAAAAATA GATTATAAGTGTTCTATTTG AAAAAAGAAA 60 GAAAAAAAAG AAAAAGAACA CAGAGAAGAA TAAAGGAGAAGAAAAAGGAA GAGAAAAAAA 120 AGAAAGAAAA AACGGAAAAG AAACCTAGAA AATAAAAAAACAAAGTATCC GATAAGGAAG 180 AGAAAGGAGA AAGACTTACC TAGAGCCCAG AAATAGAGAAACTAGAACAA AAAATGGAGA 240 AGAAGAGGAG AGAAAAAGGA TTAGAGAGGG TGAGGTAGAAGGAAGAAAAG ACAAGAAAGC 300 AGAAAAAAAC TAACAAAGAT GCATATAAAC AGAGAGAAGATGATTAAGAT TAGAGAAAAA 360 GACCAAAGAG AGAAGGTAGA CAGGACAAAT AAAACAAAAACAGGAGGGGA GAAGGGGAAA 420 GAAGAAAGAG GGCAAAAGCA AAGGAATAAG ATAATAGCACCAATAGCAGG 
ACAGTAAAGG 480 GTAGAGAAGG GACCATTCCC TACCCCATAG GGGGGAACGACCCCGGAATC AAAATACAAG 540 GCACCGAGCT GAACCTGGTT ATCACACAGG CAGGAGTGGTATAGCACGGC GTTCCGGGCA 600 AAAAAAAAAA TGAAAAATAA ATTCCTTCGG GCGGAGAACTAGAAGAGGAT GGGAACTCCT 660 TGACAGAAGT AGCAGGCAGG AAGCCAGCCA GCACCCCAGCCCAAACAGAA GCAGCCGCAA 720 TGAAACGGGC GGCAGATCCA CATCCGCAAA GTCCTCAAGGGAGCATCGGC GAGGCCCGGA 780 GCCAATGAGG AAGGGCAGGA AACCATATCA AGCCGAGCGTCGGGACGGCT GCCATGAGAC 840 ACCCGGAGAG GTAATTTTTT TTTTACGGGA AGCGTCCAGCCAAGTTAGTG GGCCGGAAGC 900 GACGGTACTT TAGTATACAT CGTTTTGCCC GAGTGGTCAGATTCTTTTGT TATCCCCAAC 960 AGAACCGTAA GCTAGAAATA 980 845 base pairsnucleic acid double linear DNA (genomic) 9 TCNCCTAAGA NANGAGANAGGTTAGATGGN AATGGAGANT ANATACCGGG CTTAGCTTCG 60 CCNNGGACCC ACCNAGGGGAAAAGAGCCNT CNNGCAACAA ACNAAAGGAN CGGAAAGAGG 120 AAGGGNANGN GGNNAAACANATTGGGCGAA TTTAAAANCT NNGNCCNGTT TGAAATAGNG 180 CNCGGCCGNT CCNTGGGCCNGATCCANCCT TCCNTNACTT TTCNTCCCCN GCNTTAAATT 240 GCGNCGNCGG CCCCCCCAACCATNTNTTCC GTTTTNANCA CCNGNGGCCC CGGCAGTGCN 300 GATGNNGGGG AATTGNNAATGCCCCCCANC CATTTTGNNT CNGNNCCTGG GGAGAGANTN 360 AAACGGTGNG NGNAGNNGTTAATATGGCGG CAGCGGNGAC ANCAGTAGCC AGNGCAGGCA 420 CGCGNAGTTG GCNGGGGACGCCANGTGNCN GGAGANNTGG AGCGGCGGCG GAGCGGGCNC 480 CNAAAAAAAA AAANAANNGNTGGTAAGGGG GCCCGGGGTG GANGANATTT CNNGGGCNGC 540 TTCTAGGNGT CANGNTGNGGCCGCTNCGTT CGGCCCTGGA TGNAGCCCNG NGCCNGTGCC 600 NCCNCCGGGG GGAGTTTGTTTCCNTCTACC GTNCCCTGCT GNGGAGCGAC GANCTGCANT 660 CCCCNGGAGC GTCTANNAGGCCGTGGCNAA CCCCATCNAN GCNCNCCAGT NAGCTTCCTT 720 CNTCCCGACA TAGTAGGCGTCNGGNGGCGT TGNCGACAGN GGCCNNCGTC GATGGGANNN 780 TCTATTTNNG NTTCATGGGCCGTATGTTAG ACCTNTCGAA GGACGCGNNA AATAGATAGG 840 GGGGG 845 528 base pairsnucleic acid double linear DNA (genomic) 10 GGATTTNNTA ACCTTTCNGGGAAGGGNGNG GAAAAGGNGC CAAACAAAAA GACCCCNNTG 60 CCCGGAAATN CTTGGGGGNNATTGNGGAGC GTTTTTTANN GGGGATTGGG GGGNTNGGGN 120 TGCNCCCNNA TATTCCCGGCTNAGGGGCAA CCCGAGGGGT NNTNTCCGAC CATGTAACTT 180 GTTTCGGAAT GAGGGGGAATGCNNATTNTG ANTATTGAAN NGNGACCCGG NGGGGNCNTG 240 TTNNAATTAA CCTNNTACCCGGAATTTCNG 
CGAGANCGNG ANGATNNCTG GCACTTNTTC 300 CGTATTACGN GTGGCGTTCNNGANTGCAGG GGNTGCCCTT GTTTGNNTTT CTGAGGGTTT 360 CTTATANGCA GATTGTGGGGTTGGAAACGA GANATCCCTN ANGTAATGCC ANNTCACACG 420 GGATGGAGCA GGAACNCCCTACGNATAGTT NACCTTCANT CAGGGTGGGG AANCGATNGA 480 CCNGAGGTAT ATGGGCNGAACNGGACATGT NGGGNNANCC GTTCAATC 528 927 base pairs nucleic acid doublelinear DNA (genomic) 11 AANACGGTTT AATAAGGGGG ATGTTCAAAA CNCCACTCCGGGGGAANAAA ANAAAAAATT 60 AGGGGGGGAG AANGGATTGG NGTATAGTTT CCCACCACAAACCTNGTTCC ATTTTTTCGG 120 GGGGGNAACG GAGGNCATGA TTATGGGGTG AAGGCAGCACCCACCCATTT TTCGGGGGNA 180 AGTCAGTTTT TTTTGGTANA ATCAAAGTTC CTTCGAACATNTCGTTTTAT CCAAGGAGTT 240 TTGGTGTTAA ATTAGCANTT TNTGNGAGTT TCAAAGTTNTGGTTCCNGAG NAGNTTTGTA 300 ATTGGTTCAC CGGTTNTTTT GNGCCAGGAA AGCAGACCCNTGTTNGGAGG GGAGATTCCN 360 ATTTTTAGTT CCCATTTGGT GTTTCCNTAG GTAATGGAGTCTGCAGACAG TTTGAGTNTA 420 NTGAGTTGAG TCCCTTNTCC TATCAGCCGG GGTGGCATTCTGTCCAAAGG AGGAATCCAG 480 CAGCCAGATT AGATTTCAGT NTCNTTTNTA ACAGGGAAGTTAGACACACC CGGCCAGNTT 540 GCAGCCTTTC CACCCCCAAN GAGTGAACCC TGCCNTTTCAGCTTTTACCC AATTTACTTT 600 CGTTGGCTTA GCATGCAGAT TNTTTGGCTC CATGCCCGGAGCAGCTGACA TGGGAGGCTT 660 TGAAACTTCC ATTATCATAG AATGGCAGGC AGGTCCTTTGCGGTTAAAAC CAGGAGCCTG 720 GGCCNAATGA GATGGNTCAN TGAGCAAAGG CGNTTACTGCCAACCCTGAT GCCTTCAGTT 780 TAGTNTTGGA ATTCACAGGG TAGAAGTTGA ANACNTTTGACTCTTCAAAA GTTGTCCCTG 840 TAGCAGGGCA GNNGTGGTGC ATNCCTTTAA TTTGGGCTACTTTGTGAAAG ATATCCACAA 900 NGAACCTTGG CAAGTAGAGG ANGTCGT 927 911 basepairs nucleic acid double linear DNA (genomic) 12 GGGAGTTTGC TCTCAGAGNGCCNATTACGC NACAGGGGGN GTCTCACANT ATAANCTCAT 60 ATANNATACT CTACNNTNCCCCCCCTNANG TNTCAAGGGC AAGAGAATAT NNTCTCTCTC 120 NTATCGTCTN GGGGNNTCTNAAATGTTTGN GCTCCCCGGG NAAAATANNT CTCTNTCNCG 180 NCTCTATNTT CTCNCCTCACATATNTGCGN ACTCTTTCTC NNCCACANNA AAAGCGCCCA 240 GTGNGGGGAN CTCNNAGAGTGTATNGNGAA GAACTGNNAG TGTNTNTGGG GCGCGTTCTC 300 GGGGAGANNA TACNCTTCTCTCNTCTCTCT NTAGAGTGNG ATGTANAAAA CCNCANNTGT 360 TGCANAGANA AATGGGGCTCNGAGNCTCTT ATATTTCCCC NCCCCTCTCN CCATATATNA 420 CCTNCGGGGG CTTNTNTNTAAATCNCCTNT 
CNCCATTNTT NNNANNNGCG TGTTTNTATT 480 GTNNGTNTCC NCNTGNTCCAAAAATCTCAA ATTTGTGTCT CTTNTCCCAA ACNCTATNTC 540 TCCCNTANCC CTGGGGGNGTNTATTATNTN TNTNTATATN CNTATNTTAT ATACNTATAN 600 TNTATNTNNT ATATATTTGGGGTCNTTACC AAAACCCCNT TTTTNTCTCA CTTTTCNTCN 660 ACTCCCTTCC CGGGGCCTNGAAANTTTATT NCCNNCCNTT NNGNTCCTTT TCTNTTAAAT 720 TCNTTNCNTN NGGAAAACCCTTTTCNAAAC NGGNTTTCCC CTTTTNNCNT CCCNCTCAAA 780 CCCCCCAAAT TNGGGCATTTTTTCTTTTCC CCTCACCNAA CCCCNTTTNC CTCCCCCCNC 840 CCCCCCCAAA NTGNGAATACCCTGNTTTTC AGNGGNNNNG AAAAATCCCT CCCCGANGGN 900 GCCCCCCTCC T 911 880base pairs nucleic acid double linear DNA (genomic) 13 GGGCACCAACGGNGGAAGAG TTTTCCANGG TANAAGAAAG NAGGANTGGG NCGANAANAA 60 TTANTTTTNAAAAAGGNCAC CAGATANAAA AAACTTTTNA GGGGNGTTAA NAAAAANGCN 120 GAAACCCTCNGACGGTTTTC NNGANTNTTA AANAGATTCA GGGGAAGCAC GAGATTATCT 180 TTTCNTTTTTGAGCAAATTG CCAGCAGGGA ACNGACNAGA GGNTNGGTTT TTGNATNCNN 240 TTAAACGTAACGCAGNTTTG GANAAACACA GNTNACATGG AAAGACCTGG GNNATTAGGG 300 TAANGNAAGNGGTTCAAGAG AGAGCCGATG AAATNGCCNG GTCCAAAATC TTTTTCCTTG 360 NCTTTAANACAGGTNNNAAA AATNNGGCTG CTGTTTATAA CNATAGNTAA GTGAANNACA 420 ANGGGTAAGTGNTTGGCACA GNCCAGGGTA AGTAGGCATN NAAGGAATGT TAAACATNAC 480 CNTTGATCGNGNGGTTGTTT ACACCGCNTT AAAGAAANGT TTAAAAATAT CCCTGGGCTG 540 TTTCTTCCTNGGTGCCNCAN GGNGAACGAC AAGCCAAGCG NATGANTCAC AGGAGACGAC 600 ATGGGCAGGTTGGGTACAGA ATCAGTGTTC AGAGACTCCA GGGGCACCCA GATTCCNTCA 660 GNCTGTCACACAGACACTGC TCCCAGGGAC AACCCTCCGG GATGTGAGGN NANGACTTCC 720 GNGNNGGAGACGCTNCAGNG ANGGGACACT CCTGGTGGTA GCACACATTC TTCAGTCNGA 780 TTNTGAGCNTCTGGTCCCNG CAGAGNACAG TGGNAATGAC TTTTTTCTTA CTTGNGNCTC 840 CAAGGGCGTCTCCACAAGAC AGCGTGNCNA GTAGATAAGT 880 923 base pairs nucleic acid doublelinear DNA (genomic) 14 GGGAGGAGTA CNGGANGGGT CCGACGTAAN TNTNTCACAGGNAAGNCGAN ANGAGGAGGG 60 GTNGCGTAGG NNACAAAGAG ATAGGAACGG GGNCGNNAACNTNNCNTNTN GAAAAGGCCG 120 CCANNGTNAA NCAACTNTGG CGGGGGTGGG ACNNAAGGCGNGNGGCNNNA GAAGGTTTNN 180 TTNNTTGNAA CCNAGATTCG AGGGACGGAC NGGANTATCNTATCCNTNTT NGTTNCGANT 240 GCCNGCGNGN ATCNGGCNAG GGAGGGTNGG TTNNNNGGTTTCNGGNGACN 
NCCCCAGTTT 300 NTGGNNNATA CCCNGCTCTC ACANGNNGGA CGNGGGTNTTTNNGGTGAGG AAGNNGCNTC 360 CCCGCGAGAG CCCGNGGNAA GGGCGNGTCC AAAANTCTTNTTCCCTGCTT NTNCNACAGG 420 CTNNGANANN ATNNGGCTGN TGTTNATCNC NATAGGTAGNTCAACCNNCA NGGGGANGTG 480 CTNNCACACC CCAGGTTAGT GTCCCNTNCA NGGTATGTTAANACGTTACC NNTGATCGGG 540 GGTTNTTTAC NNAAAANNAA AAAAAAANTC ACCNTCCCGGGCNTGNTGNT TCCTNGGGGC 600 CCCANGGTGA ACGACNANCC AANCTNTTGA NTNACAAGGGACGACGTGNG CAGGTTGNCG 660 TNCNGAGTCA GTGTTCAGAG ANTTCNGGGG CACCCCTGATTCCCNCGGNN GTNACACAGA 720 NACTGNTCCA GGNNCNNCCC TCCGGTTGNG AGTCNAAGACTTCNGGNNGG TGACNCTACN 780 GTGANNGGAC ACTTCGTGGN GGTGNCNCAC ATTCGTCGGTCGGCTTANGA NCNTCTNGGT 840 CCCNGCAGAG CACTNTNGCA ATGNCTTTNT TTGTTCTGGGGCTTCCNAAT GGGTCCTCCC 900 AAAAGNCNGC TTTAGCTGTA ATA 923 880 base pairsnucleic acid double linear DNA (genomic) 15 ANANAGAGTA ANTAANANAAGAGGAAGAGA NAAGAAAGNA GAAGGNAAGG ANANAAANGG 60 GNNGGCGAGG AAAAAAGGAAAGGAGAANAA TAAAAGAAAA AGTGAGGAAG GAAGGAGTAN 120 NAGAAAAAAG NAAAGNGGAGATAGNAGAAA GGNCCGGNGG ANAAAAGANT AGATTAANGA 180 NAGNTGAAAG AATAAAGANNANGGCGANAA GGAAAGAAGA NCGAGNATTA GAAANAAGAG 240 AGGAAAGANN NGGGGGGAGGGAANGAGGCG AANTCNNGAG ANCAGTNNAN AAGGCAAGAG 300 AATNAGGAGN AGANANGAAGNNNANGANGA AGGAGGGGAA AGAGGGNACA GAAAAAACAA 360 GTANAGTAAC CNACNNCNGCGAGNGNGCCA AATAGGTNGC GCCAGCNACA NGGCCCGAGC 420 CCNGGGCGAG GGGGCATCANGAGCCAAGGG GAGCGGGTCC AGNCNTAGTT NTGAAAGGAA 480 AGGGGAGGNG GGNAGATATTATATGGTCGN GCCCCCCCCN GTGTCTCGGT GAAAAAAAAA 540 AGGNGTGANN AGCAGGGCCNTNTTGGNTGN GGGATCGNGC ATGATCAGAG ACCNGAGGCC 600 GGACNTTCCG CNGNGCCTTCCGTAGGCCCA NTGTCAAATG TATTCAAGCC GGTTNGAAGG 660 ATGCCGGNGN TAGNGANTGATGCGGGGGCC NGCCCCCCCG GNTTTCCGCC CCCGCAGCCN 720 CNGTGGCCGC CATNACGGAGTTCCCAGTGG TGAGNGTGCG GAGNTGAGGC CCCGCGGGTC 780 GCCGCCGGTC CCCGCAGACAGGAACGCGGA GCGNNCCCTG CGCTNGAACG TANGGGNCCA 840 CTTGAAAGAC TNNACNAAANGACGCNGATT TGTAGAAAAG 880 166 base pairs nucleic acid double linear DNA(genomic) 16 ATTCTTCAGC TTTTGCNTAG AGGAAAAAGA ATGGATTGTT TCTAGGACAACCTGCTGAGG 60 TGCTCACCNA GNGTTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTCTCTCTCTCTC 120 
TNTGNCTCTC TCCTGAANNT CCCCANAGGN NCTTNGCAGN AAAANG 166162 base pairs nucleic acid double linear DNA (genomic) 17 CNTTTTNCTGCNAAGNNCCT NTGGGGANNT TCAGGAGAGA GNCANAGAGA GAGAGAGAGA 60 GAGAGAGAGAGAGAGAGAGA GAGAGAGAGA GAACNCTNGG TGAGCACCTC AGCAGGTTGT 120 CCTAGAAACAATCCATTCTT TTTCCTCTAN GCAAAAGCTG AA 162 871 base pairs nucleic aciddouble linear DNA (genomic) 18 GAATAAAACC CCAGAAAGGT TTTAAAACATTCCGTATAGA AGTTGATNAA TTNAAATAAT 60 TGGAGGTGAA ATACACAGAG GGTTTTTCAATTAATCAATA AAAAAATAAA TTACNTACNT 120 NTTTTGGGGG GTTTTATGNA NAAANGAATTGGAGGGATCA ATTTGCAAGA AATTTATTTT 180 TTNGTATTAT TTAAAAACCG TTANGGATTCNGTTGATTTT AAATCAAGCA GTAAATATAT 240 TAAAAGGTAG GAGAATGGTA TCAATAGGCCAAGATAACAG AGTGTAAAAG TTAAAAGTAT 300 TGGACAGAAA TATTAAGAGT TATTGTTAAGATCCNGGACT TTGGAAAATT TAAAACCAAG 360 CGATTTAGGC CAAGTTATTT CCACAGTATGGTATCAGAAG GAGTAAAGAG ACAGCACAGG 420 TGCAGATNTG ACGGCTTGGT TCCTTAGGTTATTGCCACAG CAACGGTCTT GGCCGCAAGG 480 CAGGCTTGGG CCCAGCATGA GAAGAGAGGGGGAACCAAGT TCTTCAGGGA CCNGACGGGC 540 GGCGCCGGTG AGAAAGGACT TCATCTTGCCATGNTCANTC AGCGAAACTG CAAACGCTTN 600 TGGCAGAGAC AACGCCAGAT CTGCAGAGGCATTCCGGCCT TTAACCGCTT TCCCACAGTC 660 GGCCCACAGG CCTTACCGCA GCAGAAAGCGCGCGACCCGG AGGTCCCGCC AGTCAAAAGA 720 AAAAGGGGGG CGCAAAACCA TATAAGGCNTGGAGCAGGCG GCCCGGCCCC GCCCCCAGGA 780 CATGGGCCCG GCCCCAATCA TGCCCCGCCCCCAGGATTCG GTCCCGCCTC CTCCCGCTCC 840 CGGGATGGGC CGTTATGCTC CCGATACGCA T871 936 base pairs nucleic acid double linear DNA (genomic) 19TGGGATTCAA AAATTGGAAG TTANTTTTTN AGGAAATTTN TTTTTAAAAT TNTAATTGGG 60GGGNNTNGCC ACCAATTAAA ANGNGTTTGA ATTNAAAANG ATTGCCGGGG GAAAAANCCA 120TTTNCTGCAN GGAATTAACC AAGTAATTTG GNTTGGNAGC ACTNGTTTTG GGCCTNTAAA 180AGGCATTTTA AANACAAATT AACAGGGCNG GCATNTTCAA CGGGNGNTAG NTTGTTTTNA 240TGAAACNGAG GNTTTTGGGG GCGGGCCTTT CCNATTNGTT TCCTTTTTTA GGATTAACAG 300ATGNGAAAAA AAATNATGGT TTTATATCAT CGTTNTTGGC ATCAGCAGAT TGGCNATTCA 360ATTAAAACAG ATCATTCATG ATNGGCTTTT TGGCCATTAC CATGNAAACA CAAAGAGCCA 420GGGTTTGATT GCCCTGACCC GCCNACCTTC GGTTGCTTAG GTGAGGTGCA GCACTGCGTT 480TTTCCTTTTC GGACTGAAAA 
CAGGCGAATG AATCATTTCN GTCGTGTCTT GAGGGTGCAT 540TTTTNACATT TTTGTGCCNT GCTGTGCGCC GGTGTGTGAT TTCCCTGTTT TAAGTGGCCC 600CTGAGGATAA CAGTGAAGTG CTGTCTAGCA TTCTTCTGCG CAGGAAGGCG GAGATCTGCC 660CTGCGGAGAA AGTATGCGTG CTGGATAAGC ATTACTGAGC ATGACACAGA GCACCGTTGA 720CCCCGAGTGC AGCGTTAGTG AACCGGCCAA TGTGCTGGGG GATTTTAAAT GGAATCACAC 780AGAAGCTGAG GCTGAGGATT GATCTGTGAG TAACAAGTTG TGAATGAGGC TGGCAGGAGC 840TAGCCTGGGA GTAAGATTCA GTGTTTGNTA ACAGCGTGCA GGCATTAAGC CAGGGAACTG 900AAAGTNCCCA CANNGNCTTT GGCAAGTAAG AAGTCG 936 888 base pairs nucleic aciddouble linear DNA (genomic) 20 AGGNNGGGGG GGGAAACTTN TTTATNTGGAAAANTTTTGT TTNGGCGGGN AAGGAGTTTT 60 TAANAANGTT AANGGAAAAA GCTTTTANTTAANATGACCT TTTTGGGGGA AANACAAANT 120 TGGTNNGTGT ATTNGNGAAA AAGATTTATTATAAGATTTT TTATAANATT TTNGGGGGGG 180 AAATATTTCA AANAAAATTC TGTAACAAAAGGNTTTTTGT TTTTTGTTNT CCAAGNAGTT 240 NTCCAGGTAG TTNTCAACAA CNNANGCCNTAGGGAAGGAC ATCATATGGA TATTTTCANA 300 GATTTGTTTT TAGGAAACAT TNTAAAGTCAAGGTTAAGAT GACAGTCAAN TCCCANGAGN 360 GNGGTAACTG TNTGCTTCTT TATTTAAAATTCAATATTCA GGATTTCATT TATACTAACA 420 AGANTAATTA CCATCTTAAT GAAACATAATTTGAATAATT TGCAAACAAT NTGATTTTTC 480 TTGAATATAC ATGTTACTAA AATATTANGGATGCAAATAG NTAATAAACA AATAGATANG 540 NAACCATGGN ACACCCCTTC TGTGATTGGNGGGACNTGGG CATAAGGCTT GTTTGTATAA 600 TAATGTTCAT ATTTTACATT CTTCCTNNGAGGANGGTCCT CCCTGTTAAG AAAANGACTC 660 CAGGATAAGG AGACAGCACC AGTNTAGGAAGTGAGGNTCT GTTTAATGTC TTAGCAAAGT 720 AGTAAATGNT GGGACCATCA GAATAGCCCNTAAGGNTGTG GANAGAACTC TAAAAGCNTG 780 ATATATATAT ATATATATAT ATATATATATATATATATAT ATATATNTAT ATAAAGAGGC 840 AGTATTGAAA GACNTNCACC AATNGAGCTGGCNAGCTAGA AGAGGTCG 888 903 base pairs nucleic acid double linear DNA(genomic) 21 CTTGGAAGGT TTTTTTNNCA AAANCCNGGG NGGGTTTTTT TTAANAAANAGGNGAAAAGA 60 TTTGGAAACT TTTTTTTTTG GTTGAAGTTA NTTGGGGATT GGGGGAAAAATTAAAAGGAT 120 TCAAAGTTCC CATGGNTTGG AAGTANAACT TTTATTCAGA AGNGAAAGTTTTAATAATGA 180 AANATGTTTT TTTGGATTNA CGGNGGNGGA ATTGGGGAGN GGAGAGAGAAGAGAGAGAGA 240 GAGGGAGAGA GAGCCGGATC CGCANTCGGG GGTTTCTACC GGCAGAGCCAGGACGGAGAG 300 GGTTTTCGGC 
AGCCGCNGCG GGTTCGGAGN TTTTAAGGTT TNTTAATCTTGGAAGGTGTC 360 TGANATNACC CCGTTTCTTG TCGGTGATGT TTNGTACAAG CTTTCATTTCTTCAGGATTT 420 CGGAGCGCCA ATTACTGCCC CGATNTGGTG TTTATGTTTG CCCGTTCNTGCGCNTGGCCC 480 CGCGCCCGCC CGNGAGCTGC GTTTTCCCTG GCCGCGCGGC CCGAGGGGGTGGGTGGGGGG 540 CCTTGGCCCG CGCACCCCAG CGCAAGGGAG GGGTCCCCTT CATTTTTTTTCATTGACTTC 600 AGCACCATGT GATCAGGAAG TCTGGCTCCN TCCATTTCCC NTCCCGACTGAAGGGAAACA 660 TTGTGTAGCA GCCCGCCGCG GCCACTGGTG GGATGGCNTT CGCTGGCCTGANGTAGGGGG 720 ATAAAAATAA CCGGCATATT TAAGGCCGGA GCAGGAATCC CGGCGCTCACACGCGGCCTG 780 GTCAGTTCCC GAAGCCGCCA GCAGCGCTCT GCGCAGCGAG CTGCTGCTGCGCCAGCCAGN 840 TCGGGAGTGC GGACACCGTG AAAGACCTTC ACCTATAGNG CNTGGCAAGCTAGAAGAGGT 900 CGT 903 918 base pairs nucleic acid double linear DNA(genomic) 22 TCGGGGGCAG GAAAANTTTG GGGTTTTCGN AAAAAAAAAA ANGGGCANAAACCCGGTNAA 60 CNTATTNGTT TTNGGCCCNG AAAGTAAANA ATTTTTTTTT NAAAANATGGAAAAATTGAA 120 AAGGGANANG CAGGGAAGGG NGGNATTTTA TNTCCAANTT TCNGGTTCCTACTTTTTTCC 180 NGATTCTGTC AGTTTCGCTT TAAGCAAAGG NGANGAAGGG NNAGTTTCAGAAGTTAGGCT 240 TGCCTGAGAA AATTTCAATG GGTGGCAATT CTTAGGACTC AGGACAGGATTCAGNGNGGA 300 CTAATNTGCA TTTNGGGATN TGTCCCTGGG GTCCNTAAGN TCCGGACCGGGANAGATGTT 360 CNAGGGGGAG ACCCAANTAA CCCAAAGGAC TGAAATTATC ATGGCAGCNACNNACCAGTA 420 GTTGNTCTGG TAATAGAGCA GATTGCTCAN AAACACGGTT GTTCCATTTGGATATATCCN 480 TGAAGTCCGG CCGTGCGAAA CGATCAGAGC CCGGGAAGAA ATCATCCCAGGCACGGAGCG 540 GGGCAAGGTT TAACGTCCAT GTTCTTTTGC TTGGCGAGCT TCGCCTTCGGAATCCGGAGG 600 CGGCGGCGGT AGCAACCAGC TGAATGAAAG ATGACAGCGG CTCNTTCGGATTGGCTCTGC 660 GGTTAGAGCA CCGCAGGGCC CAGAAAATTG GCCGCGGGCG GGTGTGTTGGTCTTTCTGTG 720 ATTGGCTGGA AGTGGTTAGT GACGGAAAAC TGTGGGCTTT ACCAAATGTAAAACGGAGTA 780 CTAACAAAAA GTAACCAGCG GAAATGCCCC CCTAAACTAA AGGTGGTGTCAGTAGTCTCT 840 CTGGCAGTTT AAATACAAAC NATCTCTTTT TAGGCATTGT TTTGAAAGTCCCCACAAGGN 900 TTTGCAAGTA ANAAGTCG 918 309 base pairs nucleic aciddouble linear DNA (genomic) 23 AGAGAGGGTT TAGCACAGGC AGCNTATTCCCAGTTTGTGC TGTAGAACTG GAACCTCAGG 60 CCTCATTCTG AAATNTGCAG CCNTCCCCAGCATCCTTCNT GGCACAGCNT GGCACAGACN 120 
TGNTAAGTGT CTATTAGTGA CTAATACAAAGGAGTATTTC AGAACGTTGG CACATCTCAG 180 CACGTTGCAA CTGGCTGGAG CTGGTTGAGCTCTTGCTGCT TCCATATCCC TTTGTAGCTG 240 CTCTCCACTT TTCTGAACCC CGGGTCCATGTGAAAGTCCC CACAAGGNNC TTTGCAAGTA 300 GAGAAGNCG 309 904 base pairsnucleic acid double linear DNA (genomic) 24 TTTCATTTAA AACNCGGGGGNTGAACCCAA TCTTNANGGT GGCAGTGNGG NNGATCTTAA 60 CGGTTTTTNA GAAAAAAAANTNCTTCGCTC NCACCCCCAA GCCTCCCNTT CTTANCAGCT 120 TTTTTATANG AAAAAAGATGATAACGAAAT TTTAAAAACC GTCGTTAGAG GAAATGAAGG 180 TTCAGCCGAC CATTACCTGANAGTAATGAA GGTNTTCCGG AGGGTTGCCT TCCAATCCCA 240 GATGGATTTG AGTTTCAGGATCAATTCAGT TACCGNTGAC CATCCACCNN CCTCCNGTAT 300 AATCATTNGA TGAGGATGAATGGTGAGTGA GTGATGATGA TGATGATGAT GATGAAGGGA 360 TGAGAAGNAC ACTATGATAACAAGTGTCTC AGTCCACATT AAGGTTTGCC TGNAAATTAG 420 TGCATAAGCC ATGGGAGACAAATTCTTTTC NNACACAATT AATAGTNTCT TANTCCTTCC 480 CATCTTCTCT GCCCCATTCTGTTTTCCACC ACAGGTCTGC AGCGGGCTAC AGCTTCCAGT 540 CTCCAAGCAA ATACCAGAACTGGAGGAGAA AATTCCAGTC CAGTGAGTCA TGGGCAGGGG 600 GAGGGGTGGG GTAAGGGCAGTGGCGCTCAT TCCTNACATG GTGTCTTCTC TTGCCTAGCC 660 TGGGATCTGA GGGCAAGAGAACCTGTAAGC TTGATTTGAT TTCCACTGCT GACTGGAGTC 720 ACTGCCAAGG GATTTGGGACTTCTCCATCT CTCTCTCTAA CCTGAAATCC TTAGGATTCT 780 ATTATTTCAC CGGACCAGAGCTGTAGCAGA GATGAGCTCC AAGTTTGAAA TGAGAAAGGG 840 GAAATTGAGA GCTATGAGCTAGGNGCGAAA GNCCCCACAA AGNNTTTGGC AAGTAGAAAA 900 GNCG 904 883 base pairsnucleic acid double linear DNA (genomic) 25 GGGGGGGGAA ACTTNTTTATNTGGAAAANT TTTGTTTNGG CGGGNAAGGA GTTTTTAANA 60 ANGTTAANGG AAAAAGCTTTTANTTAANAT GACCTTTTTG GGGGAAANAC AAANTTGGTN 120 NGTGTATTNG NGAAAAAGATTTATTATAAG ATTTTTTATA ANATTTTNGG GGGGGAAATA 180 TTTCAAANAA AATTCTGTAACAAAAGGNTT TTTGTTTTTT GTTNTCCAAG NAGTTNTCCA 240 GGTAGTTNTC AACAACNNANGCCNTAGGGA AGGACATCAT ATGGATATTT TCANAGATTT 300 GTTTTTAGGA AACATTNTAAAGTCAAGGTT AAGATGACAG TCAANTCCCA NGAGNGNGGT 360 AACTGTNTGC TTCTTTATTTAAAATTCAAT ATTCAGGATT TCATTTATAC TAACAAGANT 420 AATTACCATC TTAATGAAACATAATTTGAA TAATTTGCAA ACAATNTGAT TTTTCTTGAA 480 TATACATGTT ACTAAAATATTANGGATGCA AATAGNTAAT AAACAAATAG ATANGNAACC 540 
ATGGNACACC CCTTCTGTGATTGGNGGGAC NTGGGCATAA GGCTTGTTTG TATAATAATG 600 TTCATATTTT ACATTCTTCCTNNGAGGANG GTCCTCCCTG TTAAGAAAAN GACTCCAGGA 660 TAAGGAGACA GCACCAGTNTAGGAAGTGAG GNTCTGTTTA ATGTCTTAGC AAAGTAGTAA 720 ATGNTGGGAC CATCAGAATAGCCCNTAAGG NTGTGGANAG AACTCTAAAA GCNTGATATA 780 TATATATATA TATATATATATATATATATA TATATATATA TNTATATAAA GAGGCAGTAT 840 TGAAAGACNT NCACCAATNGAGCTGGCNAG CTAGAAGAGG TCG 883 924 base pairs nucleic acid double linearDNA (genomic) 26 TTTGGAAGGN TTTTNAGGAA AGAAANTGTN TTTNAGGGNA GGGAACCCTATTCCGACGGG 60 TTGGGGGAAA ATTTTGGGTT GACCCTTCGT TAAAAAGGGT TNCGGTAAAAGGGGGCNANG 120 TNTTNNAANA AAAATAATAG TAATAGTAGT AGTAATAGTA TTAATAATAATAATAATTGC 180 AGGAATCCTG TNACCNTCAG GAATTGGGGA AGTAGTTTCT TATTTTAGGACCAGGTGTTT 240 TGTTTCAGGG GAGTTATTTT TTGTTTTGTG GATGGGATGA GTGGTNTCAATTGCTTTNAA 300 AAACCTGTAT TAGTTTTGGC ACAGTTAGTG TGTNTCNGNT TCGTTNGAGGAGTTTGAACT 360 GGATGGTAGG CAATGGNTGC ACAGATTCAT AGTGGCCAGA GTTAGAGTAAATGCTTGCGG 420 AGCAGTCAGA ATAGATGAGA NTCAGGGACC CGGCAGATGA TGCAGGGAGAATGTAAGAGC 480 AGAAGGTGGT GGGTAGCATG TGGAATGCAC ATTTCCAGGC GTGACATGANTCGGAACAGC 540 TGTGACTGCT TAGACCAAAG TGATCCCATC AACACGGCCA TTCAGTAAGGAAGGGTCATG 600 GGNTCCCCCC NTCCCTTAGG ATTNACATAC AGATAATGAT TGATTGGTGGACCAGGGGAA 660 TGGGGAAAAA TGTCNTTTTC GTTGGTATAG TCACTGGTAG CTGCCCATGTTTNTATAAAC 720 AAATTNTAAA GAAANTCATT GGTTCATACA CGTAAGAAGA CATCAAAACAGAACTGAGGC 780 AAGTTGGGAA GAGAAATGGG ATTAGTAGGA GAGGGTCAAG AAAAGGCAAAGGTATGTGCA 840 CATGCATGAA TACATTGTAT ACATGTATGA AAGNGCCACA ATGATGANTTACCCCANATG 900 GNNGTTTGGC AAGTAAAAGA GTCG 924 482 base pairs nucleicacid double linear DNA (genomic) 27 TCTCTCCTGA GGGGGGTTTT NTGGANGAATAGAAGAANAN ACCNCCTCTT TGTTTCNTCC 60 TGTGGNGNNC CCTGCTGNTA AAGNNGATTTNCNCGGTGNT ATACANNTAA GAAGGAGGAT 120 CTCTCCCCCC ATTGTNANAG AACCCCGTGTGTGGGGAGGG GGTGTNGCCA CNANCCAGAN 180 NTGGCCCNNG GGTCNTCTCC CCACTCNTNTGNATAACNTC TNNCCTCCAC AAANACCCCA 240 NANAAAANCA CCCCNCNTGT GAGNNCNGCAGANGCGCCCT NTNACAAGAN AAGAGNNCAT 300 GTGNTGTGGC CCTGTGCTNN GACANTNTANACTCTTCTNT NGNGGGGNGN GGNCTGTGGT 360 
TTTATAAGAG NGTGTNNCCG TGGGGGGGAGAGTANTCNTT TTATATAGAG AGANAGNGNC 420 CTGTGNAAAC TNCCTCTGAG AAGAGCACCNTGGTGTTCTC TCCCATCTNC TAGNAGGGGA 480 GG 482 460 base pairs nucleic aciddouble linear DNA (genomic) 28 TAGCTTCTCT GTGAGGGGTA GAACTCAAGCTCCCCCATGA ACAGGCTTTG GGGTTCCTGC 60 CATCCCCTGG GGCTGTTCAT TAGGTGCCCACACAGACTTC TCATGCCATG ACTCACACTT 120 GACGTCACAG AGCACACAAA GAGCACAAAAGCAGGCTGAC CACATCCGGC CATGCACACC 180 CCTTTAACAG TCCCAAGCTT TCTCTCTCTCTTCTAAGTCA CTGCCCTGGG AAGACGGTTT 240 CATACCCAAG CTGATGTGCA CTTATTTCTTTGTGTTATTG CTCTGACAGT CTCACAGTGC 300 TCTGCAAACA CTCTGCATTC GCCTTTACCACACCAGAAGA AATTCCTCTT TGTGCAGGGA 360 AAAATACATT CGTCTTAGTA GCTTCTACTTTCCAGCTTGT CCCTAGTCTG TCTGATATGT 420 GGTTACGTAN TGTTAGGGGC CACGGAAGGGGGGGGGGGGG 460 465 base pairs nucleic acid double linear DNA (genomic)29 TCCCAAGACA AGAGGGGCTG AAGAACGGGG GGGGGAAGAA TCAGGAGTGT GTCGCTGCTT 60CCCACATAAA GACGGCACCT ANATCTGTCT CTCTCGGTGT CTCCTCCCCA CCTGGGGCAG 120GGTGAGCTCT CTAGACAAGA GAGAGACTGT CACAGAGAGA GAGAGATGTG TCACCCCTGT 180GGAGATCAGA GNCNCCGACA CCTAGGGGAC AAATGGGGAT CTCTTTTTTT TTTCTCTCTC 240GAGACAGGGG GTCTCTGTGC AACACTTGCT GTTCTGGAGA TGTTCTGTAG ACCAGGGTGT 300CCCCCAACTC AGAGAGCCTC CTCCTTTNCA CAACTGTGTC GCCGCCGCCG CCGCCGCCGC 360CATCACCAGG CTATATTTAC TATTATCTCT ATTACTATTG TTGTGTGTTG TGTTGAGACA 420GGATGCTCAC GCATAACCCT ANCTATCCTA GTGATAGACC CCACC 465 568 base pairsnucleic acid double linear DNA (genomic) 30 TNNCNNTTNC CTGNGGCCGNGTANCTCTGA GNGANAGTNT CCCCGAGAGG GGGGGTCTCA 60 CNNTAGNTNT ANANAGTATNGNGTGCTCGA GTTTNNAGAG AGCTCTCTCT NNNTCTCTCT 120 CCCCNGAGCT ATNGNNTTAGGGNTATGGCA CNNCNCGTCT CTCNNCNCCN TATNGAGNGG 180 TGNGNTATNG GGGNGAGAGTNTCTGCCCGA GACCCACATT CTCNGAGTNN GGNAGAGTNT 240 GGGAGACACA CANCTCCGGGNANATCTNTC TCCNCCCCCC CAGGGGCGGT GGTNCANATN 300 GNCNACAGAG CCNCNGNNTTNTATGTGGAG AGGGGATATC NCANCNCACN CCCNGAGCAC 360 AGGNTCCACA CNCAGAGANGTGTCTCTCCC CANCACACAA GCACNTCTGG TGAGNTCTAN 420 GTTTTGNGAG AGACNNTGCCCTGTCTCCCT TTTCCCCGCT CTNACACACA TGAGAGGGTG 480 TGCACATCTT CCCCATGTCCCTCTCTAAAA CCNCCCCAGA NTTTTGNGGT TNTGTGCAAN 
540 ACCCTTTTCA CNCTCANGGGAGATNTTT 568 920 base pairs nucleic acid double linear DNA (genomic) 31GAGGGTTANT TGGCCCAANT CGGCAATCAT CCNGGGAAGA AGANGNCAGG GTTTNGGCAA 60ATCGGAAGAT CAAGGACGCA ATTCGNGGGG GGGGATGGAT AGNNGCNAAA GGGNACNGAA 120AGNNGGATTG GNAGGNAAAA TTAAACGGGA GTTGTAATCC AAAAGGACGA CAAGGCAAAA 180ACAAATCCGG NAGTAAGCAG GAAGCACAGT GAANTTGGGG GAGGCAGNGT GGNGNAANTA 240AAAAATNGTT TTTTTAATCC CAATANGGTC AACANGTAGG CAANTGGATN TATTAGATAT 300TATATCTTAG CGCAAGNTTN TCACCCATTG GTCCAACCCA TATAACATGG CGGTGGTNAA 360TNTNTGAGCN TGGCACAATT TTTNACCCAT TAGTTCCCAA GGCAGATCGC CACCATGCCA 420GAANAAAATC CCAATTCCAT GGTGGCCCAG TGTGTCCAGC CACCAATANT TTCTTGAATT 480CAATTAAATC ACCACATGAA GGAATACATA ACACAATAAC ATCTGATCCA ATTGATAAGA 540TATAATTTGC TCACNTAGAC ATACAAAATC CTGTACATTC CATCTCTTAA GAATATTCAT 600AACAAACTAT AAATGTGTAG AGAGGAATTT TAATATCCAC TTCCATGTTC TCTTGGCTGC 660TCCTCTCTCC CAGTCTCCTC CTCCTCCTTT AAAACTTTTT TCTCCCACCC ATCATTTTTT 720TTTGTCCNAA GGACGGGCCT TGTTNTATCC TGNACCTGCN TTCGTCTGCA TAAGGCCATC 780ATCCCACAGG CAGGACTGGA GCAATGGCTC ATTGGTTAAG AGCACTTGCT GATCTTGAAG 840AAGACCAGGG TGCAATTCTC AGAGCACTNC ACTGCTNCAC ACTGAAAGAC CCCACNNGTA 900GGTTTGGCAA GTAGAAGAGA 920 176 base pairs nucleic acid double linear DNA(genomic) 32 TTGACCATAT TATTTTTATT CACGTTGGGA CAAAAGAGCA AACGCAAAGGATAGGAAACG 60 AAAGGAATTA ATTTCCTTTC AATAGAGATA TCGGTTTTTT TTAGAGGGAAAAAATTGAGT 120 ATTAGAAAAT AAAAATAGGT TTCGGAATTT CCGGAAAGAC CACTAAATTGTAGGTT 176 336 base pairs nucleic acid double linear DNA (genomic) 33AAAAGGGNTN CCGAANAAAA ANAATTNGGA TCTTNTGGGG GCCCNGAGGN AAAAAAAANA 60NTAANCNGGG GGNGACCCAG NGAANAGACA AATTNTTTTN CCNGGAGTCC TTGGGGTGNN 120ANGCCAAACN GNCGTTTANN GNAANNNGNC GNGNTACCNC TTCGGAGNGG GGGCGCTGNA 180AAAGAATNGT GAGAATNCNG TTACNNGTGT TGNTTNATCN GAGATAGTNG TNTGTAACAA 240CCCCGATTCA GCCNGAAAGT TACGCATATG CGNANCGTTG TGTGAATCGA ACCTGGNNAA 300AACAGACCCA TNGNCAAGNG GCAGACCNAA CGGAAC 336 92 base pairs nucleic aciddouble linear DNA (genomic) 34 TGAATAAGGG TACAAAGATT GTGTTTCAGAGGAGAGAGGT AACAAGAAAA GACTCCTAAC 60 GCAATGGCCA 
GAGGGCCAAG AAAAAGGGAA AA92 838 base pairs nucleic acid double linear DNA (genomic) 35 GGNGTNATTTTCTTCTNGTG AANTCTTTNC CAAATCCGNG GGTNTGNCCC ANNGCCCCNN 60 TTTATACACNNNATTACNCN TNNNCCAAAA CNCTATATGT NTCGANATGT CCCATNTTAA 120 ANATATGNGACTCAGTTTGA GTNTCCCCAN NTTGGNGTTG GGGTATNTGG GTAAANACAN 180 NGACCCTCTNNGGNGNTTTA TTTATATATN NGNCCCNATA TAACNCAGAG ATCTGTGTAA 240 AAAATATNNCNNTTCGCGGG GNGGGAGATT TCTCTCTGNN GTAGNGCNCT CNNCTGAGAN 300 GCACAGNGCCCTGTGTTNTN TCCCCCTCNC CGAAAANAAT TTTNTNCAAA AANANANAAT 360 ATNNACANACCCCNANAAAT ATNCCCCTTN TCTACCNCCC CTCAAANACA CCNCNNTTTT 420 TTTTTNCCCCTCAGAAATNT TTNTAATNTG GGNNAAAAAA ATCTNNGNTG GNNTTNTCCC 480 CCCNTTTNNAGNCGCCCCCT NNAAACCCCC NCTNTTNANA GANAAATATG TANACTCNTA 540 TTTAAAAAANAACANTTTTT GTTNGGGCTN GGGTNTNCCA NCCCTTCACT CTCTTTGTGG 600 GTNTNCCTTNCCATATNCCC CCTNTTTGAG ACNTTTAAAN AACCCTCTCC CTAATTCCTC 660 CNCCCNCTGTTTCCCCCTTT TNNAAAAACN TCNGGCCCCT TNGCCCCCCT TTTCTNACTC 720 CCTCTTNTCCNGAGATTTTT TCCTCNTNNT NNCTAATTCC NTTNTTCNAN TCTANATNNC 780 NNTGTTNCNANCGCANGNTN NCCCCNCCTT NNNCTNAATT NTNGGGNAGG TTCCAACC 838 314 base pairsnucleic acid double linear DNA (genomic) 36 CAAACCAGAA ATGGCCCAAGGGTCATCTCC CCACTCAGTA TGAATAACAT CTAACCTCCA 60 CAAAAACCCC AAAAAAAAACACCCCAGATG TGAGAACAGC AGAAGCGCCC TATAACAAGA 120 AAAGAGAACA TGTGATGTGGCCCTGTGCTA AGACAATATA AACTCTTCTA TAGAGGGGAG 180 AGGACTGTGG TTTTATAAGAGAGTGTAACC GTGGGGGGGA GAGTAATCAT TTTTATATAG 240 AGAGAAAGAG ACCTGTGAAAACTACCTCTG AGAAGAGCAC CATGGTGTTC TCTCCCATCT 300 ACTAGAAGGG GAGG 314 226base pairs nucleic acid double linear DNA (genomic) 37 AGGGGGGGAAACCCCTTCGC CNCGGGCCTA TCGNAANTTT TNNTCCACCG TAAAANATTT 60 NCCANGNGCNCCATGTANGG ATTGNGGGNG TAGTGGGGGG AACGATTNTG GAGGGGCCTA 120 AAAGGNANATAGAGGACGTA TTGTATTTGG TTTTGCNGAG CCAGTACCTT NGAAAAAGGT 180 TGGTATTTTTGATCCGGCAA CAACCACNGT GGTAGNGTGT TTTTTT 226 843 base pairs nucleic aciddouble linear DNA (genomic) 38 GAATTAAAAC GGGAAAGATT GGAATTCAATTTCTTACAGC CAAAAGCTAG ACCGGGCATA 60 TAGGAGATTA TTTCGATTTA GCACCTTCCAAAGCCTGCCC CAGATTTAAA GTTTAGGGGT 120 ATTATTTAAA 
AGCAGGTTCC GGGAAGTTCCAAGATAGGCC TAGAGGTAAT GGTATGCAAG 180 CAGTCCTAGG TTTCAGAAGA GTTCAAACACGGGTCTTCAG GAAAAGACGG AAAGTGTAGA 240 TTGATCAGGC CAGCAATCAT ACAACAGTGTTTGTTGTAGT ATTACCTTTT CTAATGGTTG 300 TCACTGAAAG GAGATTATTC TAGGTTTGGAGATACAAAAT TAAAAGAATA AACCCCAAAA 360 GGCCACAGAC CCAGGGTAAG CCCTGTAGCCAGGACTAGCA GGCCATAAAG AAAAAGGAGC 420 ACAGGAAACA CTGTCCAGGC AGGACTGGCAAGCCATAAAG ATAAGGAAAA GGAATGCAGG 480 AACCAGCCTG AGTTAATGAG AAAAATTAATGGGACGTCTG GCAGGAAGAC ATCTCCCCCT 540 AGCACACTCC GGGCCATATC TCAACTAGGTGTCCTCCAGC CCCTGACTTA TAGCACGTAC 600 TCTATCTGCT TTGTTATCAC AGATATGTTTGAATGAGCCA ATTGTATGTA ACCACGCCAA 660 AACCCCCTAG CTTTGTCTAT ATAACCGTCTGACTTTTGAG TTTCGTGTTC AACTCCTCTG 720 TATCTTGGGT GAGACACGTG TTGGCCCGGAGCTTCGTTAT TATTAAACGA CCTCTTGCTA 780 TTACATCATG ACCAGTCTGG TCCTGTTGTAAGACATTGGC AAAAGAGCCT GAAAACTAGA 840 AAA 843 943 base pairs nucleic aciddouble linear DNA (genomic) 39 TTTTTTTTTT GGAAAAACGG GTTTAATAAGGGGNANGNAT CCGAACCCCC ACTCGGGNGA 60 AAGGAAANAA AANAATANGG GGGGAANAANGANTTGGNGG TAATGCTTTA CCACGACAAA 120 CTAGTCCCAT TNTTCGGGGG GGGAAAGGGANGGCATGAAT AATGGGGTGA AGGCNGGCAC 180 CCACCCCATT TTTTCGGGGG TAAGTCNGTTTTTTTTTGGT ANATCAAAGT TCCTTTCGGA 240 ANATGTCCGT TTNATCCAAG GNGTTTTGGGTGTTNNAATT AGNATTTNNG NGAGTTTCAA 300 AAGTTTGTGT TCNNGAGNAG TTTGTAATTGGTTCAGCNGG TTTTTTTGTG NCAGGAAAGC 360 AGACCCNTGT TTGGGAGGGA GATCCAATTTTNTAGTTCCC ATTTGGCTGT TTCCTTAGTA 420 ATGGGTCTGC AGACAGTNTG AAGTNTATGAGTTGGTCCCT TCTCNTATCA GCCCGGGGTG 480 GCATTNTGTC CAAAGGAGGA AATCCAGCAGCCAGACTAGA TTTCAGTNTC CTTTNTAACA 540 GGGAAGTTAG ACACACCCGG CCAGTTGCAGCCTTTCCACC CCCAANGAGT GAACCCTGCC 600 NTTTCAGNTT TNACCCAATT TACTTTCGTTGGCTTAGCAT GCAGANTCTT TGGCTCCATG 660 CCCGGAGCAG CTGACATGGG AGGCTTTGAAACTTCCATTA TCATAGAATG GCAGGCAGGT 720 CNTTTGCGGT TAAAACCAGG AGCNTGGGCCAATGAGATGG NTCANTGAGC AAAGGCGCTT 780 ACTGCCAACC CTGATGCCNT CAGTTTAGTNTTGGAATTCA CAGGGTAGAA GTTGAAAACC 840 TTTGACTCTT CAAAAGTTGT CCTGTAGCAGGGCAGTGGTG GTGCANACNT TTAATTGNNG 900 TACTTGTGAT AGTCCCACAA GGANCTTNGCAAGTAAGAAG TCG 943 904 base pairs nucleic acid 
double linear DNA(genomic) 40 ACTTCTCTAC TTGCCATGGT CCTTGTGGAA TCTTTCAATC TGTGTCCTTAGAACGCTAAG 60 CTAAGACTTG ACCTTGGCTC CCAGGGCGGG CTGGGACTTG GCCACCCCGTGAAAAGGGCT 120 CTTTCTCAGG CAGGTGTTTT CGTTTAAGAA AATAAACCAT CCAAGTCCGGGCAGACTGAG 180 AGCTACACAC CCCTCCAAGC CAATCTGGAG TGGCTCTGCC CAACCCCCACTGCTGGGAAA 240 ACATGGCTGC CTCAGCACCT CCCTAAATGA AGGGAACAGA GTGTCTCCTGTGGCCTTGAA 300 AATATTAATA AATGAGACTT AACCTGATGG CTCAAGGCTC TCAGGGGGCTTTTTTTTGTT 360 TTTACACACT CTGTGGAGCT GTTACAAGGT CAGTCAGTCA TTTGCATGGGACAGACAATC 420 TGTTTTAATA TTTTATATGT TTGTCTTTTA AAAAACCTAA GATCTATATCTTTTTACATT 480 TTATTGTTTT GTTCAAAAAA AAAAGTTTTA CACAATGATC AAAAAGTTCAAATGAAGTCT 540 TTTTTAAACC TCTCTCCTGC CAAAGGAAAC CAAGCAAACT TTTTCCAGAAACCTGATAAG 600 AATATCTCCC TTTTACCCTG GAAACATTAA AAATAAGGAT CCCTGAATTAAAAATTCTAT 660 TCCAGAATCC TAATTTTATT TTTTATTAAA AAAAAATAAA ACCCCCTTAACTGACGGGCG 720 GTTTTTAAAT CACCTGCCTT CAAAACCCCC CTGGAAATTT TTAAAATTTTTTTTTTGTTC 780 CCCAACATTC CTCCCCCCCT AATAACACCT GATTGATACC CACCAATTTTCCACTGTGGG 840 TGATTGAGGT GGTCCCCCCT CTTTTTTGCC GTTTGATTTC CCCCGTTAAAAAATTTAGAA 900 AAAG 904 917 base pairs nucleic acid double linear DNA(genomic) 41 AAGGGGGGNG AAATTTAGNG GACNAAAATT ATTCCTTAAG GGCCNCCTTTCTTCAGGGAA 60 NANGGGGGAA GGAGATANTN CGGCCCTTGT CCGCCTTTTN GGANACGATAGGGNCGGTTC 120 GGNTTGGAAA TTTTTCCTCC AAAATTNCCA ACAAAAATNG TTTTTCCCCTTCCTTCAAAA 180 AGAAAATTGG TTTTTTTGNN GGCTTNGGGG NGTCNGGAAG TCANAACCCNGNGTATTATT 240 GCNTTCCAGC CCCACCCGTN AGTTCATTGG TAATTCCTAT TCGTTCGGNTCAANATAATT 300 CGGNACTTCC GCTTCCNAAT GGATCCCTTC AANGATTNGG TTTTTCCGGATTATCGCAAG 360 TCCCCNGGTT NTCCAATCCG GAGCGCNTCG GATATTTCCG GNTNTCCGTGCNTTTCTAGC 420 CCCACCCCCA NGACCACCNT TGGTTNTTTA GGTGGGTCTT TGATCCGCTTCACGTTGCTT 480 CAGTGACNTA GATCCTTNTT CGGTCTTTCC GGCTCATTTT AGTCTCGAGTTATTCTCAGC 540 TGTGTTANAA AAAAACANNA NAANAANCTC CGCCTCGCCC TTCCGNTTCGGTTCTTTCCG 600 CNNGCNTTCG GGCGGGCNGT NTCTGCCTTC TCCACGTGAC GNTTNTTCGGCNTCCCAGTN 660 ACCCCCTCCN TCCACGCCTT CNTCCAGNTT CAGCTTNTGT GCTCGTCCCGGNTGTGCCGC 720 CANNTNGTGT CAATTCCNGA CCGCGGCGGG 
GGCCGGGCAG NTGGGGNATNTAGGGCGGGC 780 AGACAGTCGG CCNATCTCCA TAGGCCGTTC CCTATNCTNC CCTGATTTTTTTAAACCATT 840 TCCAAAAGCT CGCTGTCCTC TTTCCGGGNC TTCCATTNNG GNGTNTCCANAAGGAAGNAA 900 GNCNAGTAAA GGANCTC 917 835 base pairs nucleic acid doublelinear DNA (genomic) 42 GGNCCCCTAN NGATTGGCCN TTGATCAAGA NGGGACCATCCTGNACCTGG NGGTNGNTGT 60 TTCCGCTTGG GACGGAGATG GTTGTTTTTG CGGAGTAGTTTCNGNGGGTT TGAGGCGCGG 120 NTANTTTTTT TGTTNTGGTC CAGACCGTTT TGATTTAGCCGCNGCNGACA GTAATGGGGC 180 GATACCTCAG NTCCTTGTGA ACCCAGGGTG CAGNTGGTTCAGCAGGATAG ATGTACAGCC 240 TCCGAACTTT TCAATTCCCN GACTAACCAT TGATGTCAAGTTGAGTGTTT AAATGCTTGC 300 TACCAAGCTG GTTGGTAACC TGAGTTCAGT CCCTGGAACCCACATGGGGA GAGAGAACAT 360 GCTTCTGTAA CTTGTCCCCT AACTACCCCC AATACACGCATGCGCGCGCG CGCGCACACA 420 CACACACACA CACACACACA CACACAGAGA GAGAGAGAGAGAGAGAGAGA GAGAGAAGCA 480 CAAACAATAA AAGAAAAAAA TAAAATCTCA TTTAATTTTCATTAGTATAA TACCTTGATT 540 CTTTGAATGA CAGCAAGATA AAGTAAACCA AAGCACACTGTAGAAGGGAT TACGCAACTG 600 AAAAGTGACA ATCCTTACTC CAGCCCTTCC TGCTATGTTGGCAGTCTTGC TGGGAGCCAT 660 TGATCTAATC AGTTTTATTT GAGGCAGGGG CTCATGTAGCCCAGGAGGAT GGTCAAATCC 720 ATAGCTCATC TGAGGATGAG TTTGAACCTC TGACCCTCCTCATTCTCCAG TTCTCCATAT 780 CCTGAGTGCT GGCACTGAAA GACNCCACNA GTAGCCTTGGCAGGCTAGAA ANGNT 835 924 base pairs nucleic acid double linear DNA(genomic) 43 GTNTTTTNGC CGNGGGAATT TAAGGGNGAT TTGGAGACTT TNGAATTTTCGAANGTTCCA 60 AAATAGANNT TNAGGNCAAT GGGNTTGGGG CAGNGGNGCT TTTTTAAATCANANAAGTAT 120 TAGATTTNTA TGGAAACCCT GGGGGTTCCA GTTTAATCCC TTCATCATCTTGAAATATNA 180 CTTGTTTATG GGAANGGTGN GATAGCAGCC NGAAACAGAG GTTTTTATTATTACTGTTAG 240 AGANGAGGAT TGGGGAATAG AACAATGAGA GTCTTGGTAA TATTNTTCNGGAAACAACNG 300 ACATAATTGG AACATTAAGG AAATATATCC ATGCATTCTG TACTTGCAAATTGCTCCAAG 360 GAAGATGGAG AGTATTGTAT TTCAGATAGA GATANGACTA TACCTGTTATTTTTTTCATT 420 ATAGCAACAT TAAAAAAGAT AGTAATCTAA TTTCACATAA CCATTACTACTAAAGTATAT 480 ATGTANTCTT TGTTTATCAG GTTTTACTTC TCAGAAATTG CAGCATCTCCTACAGAGCCT 540 GTCAAATGAG ACNGCATAGA TCCCCAGAGA ACAGAGAGAC TGGGAAATCATTGAAATTAC 600 ACAATCCTAT CCCAAATGTT TGCGTAGACT 
CAAGCTCGTA TCAGCTCATAAGATCAGTGT 660 GTGTGTGTGT TTGTGTGTGT GTGTGTCCCG CACATGCTTG AGTATGCATGTGTGCATGCA 720 TGTGTGTATG TCTATTGCAT TAGTAGAGAT GTTAAGGTTG AATGTATTTTCTGCTCATGG 780 TCATTGTAAG ATATTGTGCT GTATGTGATA AGAATCAATG TAACAAGGCTGGAGAGATGA 840 CTTCAGCTGT TAAAGGCTAG ACTCACTACC AAAAATAGNG CNATCAGTGTGAANTTCCCC 900 ACAGGAGCTT AGCAAGNTAA TAGG 924 435 base pairs nucleicacid double linear DNA (genomic) 44 GATTCCAGAG AGAGGAGTGA ACTGGCAGATAAGGCAGTCA GCATAATGGC TTAGATACCA 60 TGTGCTTTCG CTCACTATGC ACCCATGACACAAGATCACA GGGTACAGGC CTGGACCATG 120 GCAGAGTATA CACTGGTTGG GTAAATGAAGAGGAGAGACA GAGTGGGAAG TCGGCTTAGT 180 GGATATGGAC TTCAAATTTG ATGAACAAGCAATTCAAATG AGTATCGTGG GCTTGANTGG 240 TATGAAGACC CGTTTGCAAA GCAGTGGTCATAAGAGAGAA AAGAGAGAGA GAGAGAGAGA 300 GAGAGAGAGA GAGAGAGNAA GAGAGAGAGNGTGTGTTGTT GTTGTTGTTG TTGTTGTTTA 360 TTGGTTNATA ACAANATNTA CCTTTGGGCNCTTTNGAAAG ACTNTNCACA AAGGAGCTTG 420 NCAAGCTAGA AAGGT 435 919 base pairsnucleic acid double linear DNA (genomic) 45 CCCCNGTTAC CCNGANGTTTACNNGTTGGA TTAAANGGGN NNNAAAACGG GTGGGGNNAA 60 ACGAATTTTT TGTNCNCGACCCNTCCCCGG TTGGGGNTGG NGAAATAAGT TTTAAGGTGG 120 GAAANGGAAA GGAAATAAAAANATTTTTTT TNAAGGAAGT TCCTTNCCAC AAAAAANTNG 180 NTTNGTTCAG TAGGGTTCGGGCCCGGGAGG NAAGGCAANN TTGAANTNCA NTTAAAAATT 240 NCCNGGAANG TACCTTGGGNAGGGATTACC NTGNAATTTN TTTAAGAAAA NNTGGGTNTT 300 TTGGGGNGAT TTTNNGCCCCACCTGGACCA NTTTNGGGAA ANGCAGAAAC GTTCCAGNGN 360 GTTTTCCTTC CAGAGAGAGGGTTAGGTTCC TTCAGGGGNT TCCAAGGACG GGGACCAGAA 420 NGTGAAACAA ACCAGGNTNTGAAGAGACCA GNCGGGGGGG GGGGAGGGGG CCGTTNTAGA 480 TAGATTGAAC CTGCAGAGTTGCCTGTTACC TGAAGTTGTC ACCNTTTNAC CNACANACTT 540 NATAAANNTN TGNTGACCATNTCAGCAAGT GTCACCTTCG TTGCCAGGAC ACAAGTTTCT 600 TAAAGCTTAT TTCAGTNTCACCCGCTGGGG AGANACATTC AGGGCATGGG CGTCCCCCAG 660 CCNTCGGGGA GAATGTGGGAGGTGGCGATG TGGGAGGGAT TCGAGAGAAG AGAATGCTTA 720 AGAACCATCC AGGGAACCTGTGCGTTTGAA GGTNTGAGTT ACACACAGGC TGCTCAGGAA 780 GGAGCTAGAG CTCCAAATAGGAGCTGTGAT CAGGCTGTGT GTGTGTGCTG GAAGGGCCAG 840 TTAGCAGAGG TTGTNTTGACCACCCAGNCT ATTGAATTGN GNNTNNTCCC AAANGGANNT 900 
TTGGCAAGTT AATGAAGTC 919915 base pairs nucleic acid double linear DNA (genomic) 46 TTTTTTGGAATNTTGGAACC NCGNTTTGGA AGAAGACCTT TNNNNTNCAA TTGGGGAANA 60 ATAACCGGGGCCAAACCTTG GGAAGGGGGG AAAANATTCC NGGGGGGAGG TAATTTNTTG 120 GNNGGNAGGGGNGGAGGTTA NTATNNCGGT TGNGGAAGTT TGGAATTGTC CNAANGGATT 180 TTGTTTAAAAAGAGGNTTGC NGGGCNTGNT CCCTTCAACC ANGAGGTGGG GCCNTTGCAT 240 TTATTTTCCTTTTAACNTTT GAAGGTGAAG CCGGGTTATT TNTTTGTCCT TCGTACATTT 300 ATCACCACGGNGTTTAAAAN GTNTTTTTAT TTCGNTTTNA TGGAGGNGAG TTAAATNTCN 360 ATTTCCAATTAAACCTCNGT GAAACCTTCT TTGATCCTGC CTNGTGTTTC CTGAGTGNGA 420 CATACCTGCNTAGTTNTGGC CTTCCCTTTC CTTNTCGTCC TTCTTCCATT CCCTTCCGAA 480 GATTCCTGAAGGAGTGAAGG TTTGGGAAAG GGGGAGGGAC AGAGTGTCCA GGGCTTGCGT 540 GTCAGTAGACANNAAANAGC CGNAGGGCAG CCCGGGGTGA AACCACAAGG CAGAGGCCCC 600 AGGGTAGACAGCTGACAGGC CCGCCCACTT TGGCTCCTGC NTTCGCTGTC TCACCCCAGA 660 ATTTTCCTGGCAGGAGTGGA AGAAGTTGGT ATCGAGTCTT TGAGCCCTGA CTCATTNTCT 720 GTCCTAGCTGGGTGCTCCTC AGTTACATCT CCAAGTGTCT CTCAGGGGTT CAGTGTTAGC 780 CACATGGCTGCCTCAGNTCA AACCGGAAAC CCAAGAGGCG GAAACATGCT TCATTTAATT 840 CCCATCTGGGGACCCNTACA AATTTANGGN TTGTACTNAN GGATTNCCAC AANGNNAAAG 900 GCNAGNTAGANAGGT 915 849 base pairs nucleic acid double linear DNA (genomic) 47GTTAAANANG AAAAAGNGGG GGTGACAGGG GGNGANACCC NTTGCGCCGG GCTATGGATT 60NTNGGCACCG ANAAGATTTN CAGGNGACAN GGAAGGTGGN NGGGGANGGG GGAAAGTTTN 120GAGGGGCCAA AAGGANAAGG AGGANGATTG ATTGGTTNGG GAGCAGTACT TGGAAAGAGT 180GTGTTNGATC GGNAAACAAC CACGNGNAGN GNGTTTTTGT TGCAGCAGAG ANAAGNGAGA 240AAAAGATNTC AGGAGATCTT GATTTTTTTC GGGTCGAGCT ANGTTGGGGG ATGNGAGGGN 300ACAATTCACA AGATTTGTTC ACAGGGAGNT CNAGGAGGTG GTCCCANTAG CCGGTAGGGG 360GGTTTTCTCA ANAAATGGGN TCAGTCAGGT GNTTGCCTAG ATCTTTCATT AGTTCCTCCC 420TTCAAAGGGA NTTTGAAGGA GTGCTTTGTC CTGTGGAGCA ATTGACTCAA TCAATAAACN 480TAAGTAATCT CCCGGANTAC TGNNGANGCG TTCCCAGAGA GGTCCCCCGT AGTNACCAGT 540GAATCACAAT TTCCTAACCA TANGANTNTT GTTAATCTCA CCACATAAAC CCACAATTCT 600CGCGTCCTTN GTGATGGTTT CAAAGTCNGG AATATNTTTT CCTCCATCCC TCCTTTCCTT 660CCTCCTTNTA TCCCTCCCTT CCTTTTTTCC TTTCACAGGA 
TCTCANNATG CAGCCCAGTC 720AGGCCTTAAA CTTGTGATCC TCCTGTCTCA GCCTCCTAGG TGTTAAGATG ACCCAAATGT 780AAACCATGTC CAGNNACTTC CTCCTAATCC CATCTTCAGA TATCCTTTAA GACCAAATTA 840AATATTAAC 849 925 base pairs nucleic acid double linear DNA (genomic) 48AAAAAAANAA ATNTTGGNGG ACCNAANACC ACCAATGGGT TTTGGGGTCC GANCGNNCAA 60ACNTGNTTTC ANTGTTNTTC TGGNTTTNTT TGNNTAAACT TGGGGTTTTA AGGGTTNAAG 120GTTCCAAACC CNATGTTTTC GCNCAATTTA GGCGGGGNGG GGAATCCNTT TGGGGANGTT 180TNAGTATCTA GTTAAGAGGG GCCATTTNGA GATTGACACC TGAGTTAAAC TTCNGAACNN 240AGNTGTNTAA TNAACCCGTG AAGGGGCTGA GGGGNGTTGG TTANGATNCT CAATNNTAGG 300GNAAAAANNA ATGTGGTANG GAGACAGTAG NNTANTCGGA NCAANTNCGC ATCGGCCNTT 360NNATTAATAA GCAGNCAATT GAGGAGGTTA TCCACGACAG NGANAGGTGC AGACCCCACG 420CACACTGTGA CAGTGGTTTA TGTNACANNA TNTCGGGAGN GATGGNGCCA CACCNACTGA 480GTTCCGTTTT GTTCGGNTGA AGGTAGGNCA ANACTGGCAN AGGTGTTNGG GGGCNAGACG 540NGAGATGNGG NTTGAGCNTT CAGACCNAGN TNCANGGNNN NGGACNANGG TCCCCNGNGC 600CNTTCTAGCC TNGAGCAGNT TCNAGAGAAN TATTCGNCGG GTATAGGTCG CCCCNANGAC 660GCNAAACGAC CGNGAGCGAG GGCGGAACAG CCAATCAGTT CGANTTATCG TGTNTGTTNG 720CGGGGTTTGA TCCCNGAGTT AGNTCAATGA GCCCANAACC CTGAGTGGAG GNACCGTCAT 780GGGAGGAGAG GNGAGTCACC NGGTACCTGG CATACNGATG GACCATCCAG TANTTGGATN 840GGAGGGCGAT ATNGTNANTC TTAGGGGNTC TCCTGAGGAG GGNATACCCG TGAGTTCCGT 900AAGGGCGTTN GCAAGTAANA AGTCG 925 827 base pairs nucleic acid doublelinear DNA (genomic) 49 GCCAGTTGCC CTCAGATGNC CNATACCCCA CNGGGGGNGTCTCNCCCCTC TCTCAANTGT 60 ACACACACTT CCCCATAGAC ACNGGGGACC ATAGCTCTAGGGGGAAAACA AAATNTTATN 120 TGTGTGTGCA CNTGTGNGTG TGTGTGNTGC CCCAAACACAGGGGTNTCTC TTCCCCAGNG 180 GCCCTAAAAT GTTNTNTGTT CNCCACTNGG NCCTCATNTNNACATACCCC CCNNGNCTCN 240 GNCCCNNATA CCCNGACANN GAATGTGTGN NTNCCCATNNGCGCTNTCAC CACCACAGNT 300 TTTNTAANAC ATCTCTCCCC NNNATATCTN TTNTTTNNTNNGGGTCTCAA TGGAGACNAC 360 ATATACACNA GTGTGTNAGA CACACCCCCA CACCCCAAATGNGCGGGGGG AGGGCTCTTA 420 GCGCAANGAG AGNGCAGNGT GCTTACTCCT CGCCCCCTCTAGAAAACTCA CACTNTTNAG 480 ATCTCGGGAC TCNNCCTCAG CNCATTCTCT ATCTCCCANAAANACACAGA GNNACCCTNT 540 TTGNGAAAAC 
TCANNTGTGT ATAGTGCTCT GNGTGTNACCCCNAGNCCAC ACCCCCATAA 600 NANATNTNTC TCTCAAAACA TGTGCATGNG CGTGTAACACTCNCCATCTC TCGGGCNNGC 660 TCTCCCCNTN ACATCTCTCG NGNNAANANA AATATATCCCCTCNNTTANC CCCCGTGTCC 720 NGGANAATAT TNCCCCCCTG NGACCANTCC CTCCCCGGAGACCNANCCCC CCCGTGGANA 780 CCCCCCCCNG GNATCAACCC CCCCGGGTAN ACAACCCCCGGAACCCC 827 899 base pairs nucleic acid double linear DNA (genomic) 50AAAAATTGTA AGGAGTTGGG GGNATCCCCC ATAATTNAAA NAGGGAACAA NCCNTAAAGG 60GAGGGNNGGG AANGGCCAAN ATTGGNTTAA AAANAGTANG TTTGGTTGAT CCANACACAA 120GGAATTTGTT ANAATTTTNN TAATGGAAAT NGGGCACTTC AATTGGGANG ATAAAACCCC 180AGGAAGTGAT ACCNGGGTTA TCAAGTNAAA CNTGATTCTT GGNGNNGAGG GAAAGGATAT 240TGAATTTGAG TGAGTGCAGG TGAAGTGAGA CTTGGGAGNA CAGGTCATGC CCACCCAAGG 300GAGGAGCAAG GGNTGGGCAG TGTAGGTGGT GNGGTGGTCC TTCCTGGGGT GGGCGGGGAG 360ACAGATGAGA ACGTTATTGG AGGACAGGCA CAAGTGTTAC TGAAATGCAA ATCCCTGTAG 420ATNTGGAAAA GTTCTGGNTT CAGGCTTGAT GCTTGGGCCG GCAACTGTGN ACTTTCCCTG 480TACGTTCAGC CCCCCCACCC TTACGGAAGT TNTCGTCACT GAGANTAGTG GCTAATCAGA 540GTCTTCAATG GACCTGCCAA TCAGAAAGGA AGGCGGGCTT TTCCGGGTGC NTAGGTGTAG 600GATTCGCTCA GTAGTTAAGC AGTCTTAACT GGTTNTGGCT GCTGTGCTCT CTGTCCTGCC 660GTTGGATTNT NTGAGGCATG TTCAGGCAAG CTCCAAAGTT GCGACATGGT GAGCACAGGG 720GCAGGGGGGG CGGGCGGACG GGCAGGGGAC TGAGCAGTGG GAGCTGGTGT GGTGGGTCTT 780TCCCGGGGCT GAGTTGGAAT CCGCGGCTAC CCGTGAGGTC TTAGCCACTC ACTAGACCCA 840GCGGCAGTTT CTGAATAACT TTCCTTGTAG GGGCTGCAAC TCTTGAAAGA CCCCACCAG 899 852base pairs nucleic acid double linear DNA (genomic) 51 AAAACATTGGCNAGACTTGT AATAATTNCC NGTTNGGGGA AAANAGNGGN NTGNGCTTCG 60 GGGGNGGGGANCCGAGGTTC CCCCCAAATT TCTTANNAAT TGAGGGANAT TNANGGGGGG 120 AACCGANNGNTCNNNAAGGN GGGGTTTTTC CCNTTNGCCC CCTTGGGGNT TNACAANTTG 180 ACCNTNAGTTAACGGGGANA ACCCGCCNTG TCCTNNGGGA GGGGGGTTCC CTNGGGAGTT 240 NCGTNGTGGGTTTCAGTTCG GACCAGGTCG TTNACTCGAA AACNGGTCCG CNGTATNCAC 300 CCGGTNGGCNGNCTGTTGAN NGCTAACGNG GTAAGTATTT TCATGTGTCC GAACGTGTTA 360 GACTCCAAGTATGGCCATGT GCANGAACCN CCGGTTAGCN AGACGCAGAG CGTGATCNGN 420 GGAGGNTCTNCAGGNGTCCA ACCNGGNANG NCAAGATNCG 
TCGACACTGG CAGNACCCAN 480 TGGNGACTGGNNGATCAGAG GGAGNCAGGT ACGCNGGGAA ACAGAGTTGN TGNATTGGAT 540 CCGGNANACGGACANNCNAG NGGGNCNGTN GTTTGGTATG TGNGCTAGNA GGANGCCAGG 600 NACAGTCGGAAAGGNTGTCG GGAGGNTCNG ATCATGTCNT ACATAACCNC TCGTGAGTAT 660 GCGGTGGNTGTGGAGTTGNG CAGGCGGCAG NTAACGCACC AGAGAATTCN GATNTNTCCG 720 CAGATCGACAGATNTGTTAG GTGGGTCTCT GACGTTNAGG NCGANAGGAN NNGGGAGNGG 780 ATAACANTNTCACACAGAAT TTCACTGAGG CTGAAAGACC CCANTTGTAA NTGNCCAAGC 840 TAGCTGAAAT CG852 967 base pairs nucleic acid double linear DNA (genomic) 52AAANCCTTCC CGGNGGGGTT AAAANAGATT ANGGGTTTTC CGNGGGGAAN CCCCNNCCNC 60CGCCTTCGTA ATTTGTCCCC AAGAAAAATT CCCGCGCCCN CAAAAANNAG GGGANTNGGG 120GAAATNTTAG NGGCCANAAG NAAAAAAGAN AATTGTTTNG TTTTGGAGNC CACNNCGNAA 180NAGGGGGTNT TAAACGCAAN AACACCGGGG GGGGGNTTTT TNTTNCAACG CGAAAAANGC 240GGAAAAAGAT TTCAGGANAC NTGAATTTTT TNGGGTCGAA GTTCAGTGGG GGGATTGGGG 300NGNNAAAATT TNANACNGAT TATTGGTCCN ACCTTTCTCC TTCCCNTCCC TNCCAAAATT 360TTNTCCAATT TTCTTCTTTN TNTCCATTTC CCCACCAGGA GGGAGTCACC CACCTTNTGC 420NGCAACATTC TCAGGGTTCT TCATTCTCAG TGTAACAGCA GNTCTTCNGG TTCTNGGGNA 480NTCAGAAACT GGGCTGAATC ATGTCCAGAG TTGCNGAGTT CCCACATAAC AGATAGTGTT 540NGNGAGATTC TCAGTCTAGA ACCATGTGAG CCAATCCCCA TCAAATCTCT TCTCTCANGN 600ATAAATNNAA ACATNCTTAN GGGAGGCTCT ATTTCTATGG AGAAACCAGN ACCCATATTT 660NGGGCTGGAT CACTCTTTAT TTCCATTATG GGATGTTTAA CAGTAATCCT GGTCTGCATT 720CCNTAGGTGC CAGTAGCCAT CTCCTAGTTG TGACAATCAT CATTTTCTGG GGATGAGGGT 780GGAGAAGGGG GCAGATATCA AAACTATCCT GNATCTAAGA AATGTTAGTT GAAATGAAGT 840TGTCATGGGT CATAAAGTCT AGGATAAAGA GTGATGAGAT GTCACTAACC CAACTCTTTT 900GGCCAGAACT CAATGAGGTN GTCCCATTTG ANTTACCCCA AAGGNGCNTT AGCAAGTAAA 960AGGGNCG 967 700 base pairs nucleic acid double linear DNA (genomic) 53GGNGTGCTGG GATTATAGAT GCACTCCCCC AAATCCAGCT TTTTACCTGA TACCGGAGGA 60AGGAACGGAA GTCCNCCGGC TTGCACCGGA AGCAGTTTCA CCCACTGAGC CATCTCCCTG 120GTCTGTCTGT CTCAGCTTCC TGAGCTGGTG TTATGGCTGT GCACCACCAT AGCTGGCTTC 180TTTATTATTT ATGTATGACT NGGGTCTNTC TGGGGGTCTG TTAGNCAGTC TGTTAACTAC 240CATCTTTTGN CTCAGGCAGC TGCAACAGAA 
AACAACNGGC TGTAAATNGT TTTGACAAAT 300GGGTCTGGGG AGAAGTCTGT NATGCAGGGA GATCTNGAGT TTATNCAGAG GAAAAGGTGT 360CTNTCAGNGN ATCTAGGGNA GCATNTCCTN TCNGCGTCTT GGTTTGGGNG AANGANGGAT 420CAAGAGCCCC NNAGCNNNNN AANTTNCCNT CGAGCAGCCC AGGGATTTTN GCTTTCAACG 480NANCTNNAGG GAACCCCCNA NCAACCTNGG CNACAATTGG GGNNTTTCCC CCNCCCCCCC 540CGATTACTTT TNCAAACCNT TGCCACNCCC TCGCNCNATG CCNANCCCCC AAAACGTCGT 600NNTTCATAAN CNCNNCNCTC NCNCTTNNCC CATGGGGNGC ACACTCCCTT CNCCCNCNTN 660TNTTAACNGG NGGCGCAAGN CCTTTCTTNC CCCCTNCCCC 700 229 base pairs nucleicacid double linear DNA (genomic) 54 NCNACGAGAN GTCAANGTGN AANCTGNCGATGATNAAAAN AACCGANCTT AGGGTGNCAA 60 NGGGTTACCC AGGANGGGGN CAAAGCAAGNTCCAGGCCCA TNANGGACCT GCTGGTNCAT 120 NGCCNGNAAA NACCTACTTA TCCTNGAANAGCCCGAAANG TCCGCTNNGA CCANNTAAGT 180 NCANNNCAAN ANGNACCACN CCNTTAACACCACCGTATGA NCCCNAANT 229 465 base pairs nucleic acid double linear DNA(genomic) 55 CCCCTTTCGN NGGCCTCAAT NANTNATTGN CTACCCNANA GTGGCGGTCTNNCATCATGA 60 CAAATAAANC AGCCTTCATG AAATACGATG GCGGGGGGAT TAGAGGNNTTTNTTGAAAGA 120 GCTGAAGGGG CTTGCAACCC CATAAGAACA ACAATGCCAA CCACCCAGAGCTTCNAGGGC 180 ATTAAAACAC TACTGAAAGA CTATACATGG ACTGACCCTG GNCTCCAACTGCATATGTAG 240 CAGAGCAAGA GCCTNGTTGG NGCACCAGTG GAAGGGGAAG CCCTTGNTCCTGCCAAGGTT 300 GGNCTCCCAG NCCAGGGGTA ATNTNGGGGG CGGNGGAGCA GTAAGGGAGGGTGGATGGCG 360 GGGCTACCCA TATNGNGTGG CGGAGGAGAT CGNNGCTNAT GGACAGGAAACTGGNAAACG 420 GGAATNACAT TGGANATCTC NATAAAGNNN NCATTTCTTA TTCNA 465 564base pairs nucleic acid double linear DNA (genomic) 56 TTGGGGCCGNTNAACTCTGN GTNNNAGTAT NCCCNANAGG GGGGGTCTCA CANCGGGTCN 60 CACCNCATNTGNGGGNGCCC NTTCNCNACA ACACATTTTG TCNGGNGGTT ATAGNGAGAG 120 CACANATTTTGAGAGTCNCC NGANAGGGGA GAGAGACNCA CACNAGTCTC TTCTCCCCGT 180 GTTCGCGAGNGNACNCTTCT CTNCACATCT ANAGTATANC CCAGNGTCAC ATATGTGGCG 240 GGGGGGTNGTGTCAGNNACA GNGTTTCCCC CNCCNGTNTT TCCCCCTNCC CCCCCCNCAG 300 GGGNAGACAANGTNNTAGAG AGAACAGGGG TTATCCACAC ATCNCACTGN GNGGCACAGG 360 AGGANNANANTTGTGCTNAG AGCCCCTGCN CTTCTGGTGG TANCTCTGGG GCCCATATTC 420 TCTNCTCTGGGTCCCCCCCG GGGGGGTGTN NCCCTCNCCG 
GGAGAGAGTN TTAGAGANAA 480 ATCTCCATCNCANATGANAA AATNTGNGGG NGAGAANCCC GGGGGATATC ACTNTTTTAN 540 AANNGACCCCACCCCCCCCC CCCT 564 822 base pairs nucleic acid double linear DNA(genomic) 57 GATTTGCNCT CATATNTCNT TTACCAAACA GNGGGNGTCT GCCCCCCTGTNATANACCTC 60 TTGTTNTCGC GGGGTGCTNN TNGGGGCCCC CCNTGTAGAA AAAGAACANNNGNTGTGGGN 120 GGGGGATTTC TCTCTGNTGT AGANCTNTNC NCTGAGACAC ACAGNGCCCTGTGTGGGGTC 180 CCCCTCNCCG AAAAAGANAC CCCNAAAAAA AAAAAAAAAN AGACCGCGNGGGGNNGAAAA 240 ATATCTCTNG NNATCTTCTC TCTAANCTCG CTTTTANTCC TCAGAAAACCCCACCCCNCC 300 NCTCTNCCCA GAAATATNAT ACANNNNGNG TTCCCCTNCC CAAAACCCCAAAGGGNNTCC 360 CCTCTCNTCT NCCCCNAATA CTCTTCCNCC CCTTNATTCT CNTATCTCTNNGGACTCANA 420 CTCTAAAACA CANGNNNCTT NTCTGTGCCG CAATNTNTTN TGTNACANGGCNCCCTGAAA 480 AAAACCCCCG TGTTCTCCAC ATCNCCTCTN TNATATCTCT GCCCCCTTCCNCTATATCNC 540 TGNGTTTATA ATTTCCAAGG AGAATGTNCN CAGGGGGGCC CCAATCTCCCCCCCTNGTTT 600 CNNCGAGNAG GGCTCTTTTN TATATTTTTN NTCNAAACCN CCNTTGTCCTTTTAAATNGG 660 CNTTNACNCC CNGNCCCNCC CAACNNCCCG ANCGGGGGAA ACGTTCCCCANTTTTCCNTT 720 TCCCCCCGCC CNCCCNNACC CCAATNCCCT TTTTTCGCGT TCCGGGGGCCCTGTTTCCCT 780 AANCCCGGAA TNAANTNCNT TNTTCAANCC CCCCCCTTTT TT 822 553base pairs nucleic acid double linear DNA (genomic) 58 TTTGGGTGCGGTCTCCTCTG TGTTAGTGTA TCCCCCATAG GGGGGGTCTC ACAGGGAGCC 60 CTTCTCTTTTGGGGGGTTAT ACACAGGGGA CACACATGTG ATATAGAGAG AACACATGAG 120 AGTGGGAGAGTGGGGGGGTG GGTGGAAGTG AGAAACAGAG AGAGAGAGAC TTTATTTTTT 180 GTGGTGTAAAATGTGTTGAA TCTCTGGTTT GATAAATTTT ACACATTGGG GTTTGTGTAG 240 ATCCCTGATCTCTCTCCTAT CCCCATTCTC TTTCAGAGAT GTGTCTCTGG ATTCTCAGAG 300 AGATTTTCTGGTCTCACATG TTTGGTCCCT TATGTTCTCA CTCTCTCTTC TTTATTCTCT 360 GATACATGTGCTCTTCCCCC TTGGGTCTTC TCTCTGTCTC TGTCTCCCCC CCCATGATAC 420 ATAGAGTGTGTTTTCTCCCC GGGGTTTCCC TTGTTCACAA GAAGAGCTCT GGGGAATCTC 480 TATCTTCTCAAGGGTATAGC CCCCCAGTCC CCAGGCCCTT TTTCTTGGAA TTTTGGAGGG 540 GGTTCCCCATTTT 553 904 base pairs nucleic acid double linear DNA (genomic) 59GGGATTTGCT CTCAGATGGT AGTTTACGTA AACTGTGGGT GTCTTGCCTC TCTCTCAAAA 60CATGTGCGCG TTTCTGGGCC CGTGCGCGTT TTCTGTGCTC 
CTCCTTCTTC ACTTCTTTGT 120CGCGGGGGCG CTCGCCCCTG TGTTTTCTGT GCTCCTCGGG GAGATGCTCT CCCTTGGGGC 180TGTGGGGCTC TGTGGCGGTG GTGGCGGTGT CCTCGATACC GTGCTTTTTT GTTTTCTCGA 240GATCTTACTT TTTCCTCTCC CCCTTGTGTG TTTCTTGGGT ATACACGAGA TTGTGTGTGT 300CTCTTTTCTT ACCCCCTCTC TAGTTTATAT TCACACTTAC TCTCTCTCTT TTCTTTTTCT 360CTTTAGATTC TATCCTTTGT GCACTTTTTC TATTGTGCTC TAGATTTCTC CCCTTTTTGT 420TTATTTCTCT TCTCCCTGTG TCCAGTGTGG TGAAAAAGAC CCTTATTAAA TTTAGACTTG 480TGCGCTCTCT TCTTAAATTT CATGTGTTCT ACAGTCTCTC TGCGCTTTAG ATATTTTTAG 540AAGCGCCTAA ATCTTTTAAA AACGTGTGAG ATCTCTTTTT TTTTTTTACA CTCCTTTGTT 600TTTTCTTACT CCTCAGGGGC ATATAAACCC CCCTCTCCTT TAATATTTCT CACTCTCTTT 660CTTTTCAAAA AAATTTTTCA ATCTAAATCC AAATTTTTTT TTTTTTTTGG TGGCCCCTAA 720TTTTTGGGAA CGGCCCCCCC CCCTCCTCTG GGCCCTCATT GGGGGGATTT TTTTAATTCC 780CGTAAATAAA AAGGGTCGGG CCCTTCTCCC CCCGTGGGGT AATTAATCAA GGATTTTAGG 840GTTGGTAAAA ATTTCGGGTT TTGATGGTTT TGCCCCCCCC TTAACCCCTC TTTTTTTTTT 900TTTT 904 698 base pairs nucleic acid double linear DNA (genomic) 60CTCAGCACTG AAAGAGATAG ATTAAAAACA AAACAAAACA ACAACCAAAA AAATACAAAC 60AAACAAACAA AAAAAAACCC CAAACAAGTC GCTCAACTGT CTTGAGTCAA TAGATTTTAA 120AAAATGAGTT AAGGTTAGGG TTAGGTTAGG GTTAGGGTAT AGCTCAGGCA GTAAGGTACT 180TGCCAAGAAT GTTTGAGGAC CTAAGTTTGN CTTTTTTCTT TCTTTCTTNT GAAACAGGGT 240TTCTCTGTGT AGCCTTTGNT ATAGACCAAG GCTGGCTTCG AACTCAGAGG ATCCACCTGC 300CTCTGNCTCC GAGTGNCAGA ATTAAAGGCA TGTGCCATCA CTGTCCAGCT CTTAGGTATT 360CATTTTTCAG CTTATAGTCT TTTGGCAAGG GATGCCAGGG NAGGAACCAG AGGCAGGGTT 420GAAAAACAGG CCACNGNGGG GGGAACGCTG CTTCCCCGGG TTATTTTCTT GGGTCANATC 480NTGTGGCCTT CCNGGGGGGT CTTTCCCCTT TCAAAATTNT TTGGGNTTGG GGNGGGGTCC 540AAATNANTTT TTTNGGCCGG GTTTNGGGGN CCCCCCNNTT TGGNTTTTTT TTTAGAAGGC 600CCGGNGGGGA NAAACCCCCC GGACTAAAAA AAAAAGGGGG GGANCCCCCC NGGGGNGGAA 660TTTTTCCCGN CCCTNAAAAG NAAAAATTTT TNTTTTCC 698 851 base pairs nucleicacid double linear DNA (genomic) 61 GAAANAANTC GGGAGAAAAA NAAANNNCCNTTAAGAGCTT GCCCCCANAG AAAAANTANN 60 AANTNAAAAA CTGNTAGACC ANNNGAAAAGGAAGCGCAGT NANAAAATGG TTCCTACGGG 120 TTAANTAAGA 
AGCANGACNG AAAGANNGNNTNNATNTAAC CGGGGNTAGN AAACGGCCCN 180 CTTGTANNAG GACCNAATCG AANTAGTACGATCATGNTAC ANAGGGAAGG GGACGTTACC 240 CNCGGANGAA ACCCGGCACA AGATCTCNNAAGGGAGAAGA TTCTGAACGN NANNAANCCA 300 CAAGGAAATT ACTGTGGANA CGGGAGGAATCNATNGTNAT NNAGNNNAGC TGGNCACTTT 360 GANAAGGCAT CGATANAANT GATGATGGNTCAGGCGAAAG AGCATACGTA AAACCAAGCA 420 AGGNGGAATA GTCATANAAC CATGNAAAAAACNTTCAATA AAAGATNNCC NGAATATTGA 480 TCNGTANNNA ANAACNCCCG GTGGCCGTGATTCCTTTTTT AACGGCAAAC AGCANNTTAG 540 TTTCAGATCA CCCAGATCAT CGNTGNAGATNCCATNGATG TTNTTGAAAC TNANCTNGAG 600 GATTCAAGAA NNGNTGACAT GGTGAAATGATGTACAAATN ACAACANAGA NCGTCGAGAT 660 NNTATTCCCC CNGNATGNAN GGACNTCTTATGATGAANAC CTTATACCAG ACTCAAGTAN 720 AACNATATGA TCCCATGAGG GNGGNNACCCAGGNAGTCAN GAANAAATAC CNGAGAGTTA 780 AATGCNTTTT TTTGTNTGNG AACCCANTGCCCGACCTNTC AAANAGAAGC ANAGCCCNAA 840 AATTAATCCA A 851 936 base pairsnucleic acid double linear DNA (genomic) 62 CTAAGGAAAA GGTTTTAGGAGGGAAAACCA ATAGGCCCTT GAGTTCTTAT TCTTAAGACA 60 TTGTAAAGGA AAGGTTTAGGGGAAAAATTA CCAGCCCGAT CCATTAGGGT TCCAAAAGAA 120 CCGTTCTTCC ATAAAGGCCAGAGTTCACCA TGAGTAACCA GGATGTTTCT TCGGACCTTA 180 TAAATATATT TTGAGGGGTTCATGGAATTG GGTTGCCATT TGGTAGTTGG TAGCCTACCC 240 TGCTCCTTCC CAGTGTTGGATGCAGATATG CGCCCTGTTG GTTTTGAGTA GTTTTGAGAT 300 CAGTCAATTT TAGGTTTTATGGCAAGCATT TATTCATCCC CACATTTTCT GCCAGGGTGT 360 AGTAAGTGAG TTCTTACAGAGCAGAGAGAA GGAGCAATCT GTGTTATCAA ATCAACTAGC 420 ACCAAGCACA CCAAGCAGCCAATCCTTAGA AGGAAGAAGC AAACACTTGG GTATCCTTCC 480 ATGGCTAGGA AATCTTCATGGCTCACGAAC CTTGGGATTT CCCTGTCAGG GTAGAATACA 540 AGCAGCTGAG ACCGAACAGGTATGGGTGGC ATGTCGAGAC AGGAAAAGAA CCTGTGTCTG 600 GGGAGAGGTG TGTGCTACAAAGCCAGAGAG AGGAACAGAT AGGGAGGGGT GTGCTGCACC 660 ATCATGGAGG GGGACAGACGATTTGTCCCC AAGGAAAAGC TCCCTTTATG AGAGTTCTTA 720 CTGAATTTGG GAATGACATGGGAGACCAAG GGCCAAAGTC CAGATGAGCA GAGTGGGGAG 780 GAGGGTTGGA AAGTTCCAAGGAGAGAGGCG TGGGGGTAAG GGAAGCTCGC AGGGCTCCGC 840 CTCTGCCAGT GACCTTGGACCGCTTTCTCT GAGGATCAGA GTTATCTGTA GGGGAGATGA 900 GGTTGAAAGA TACCCACAATAACTTTGGCA AGTAGA 936 911 base pairs nucleic acid 
double linear DNA(genomic) 63 GGGAATTTAA GGGNGATTTG GAGACTTTNG AATTTTCGAA NGTTCCAAAATAGANNTTNA 60 GGNCAATGGG NTTGGGGCAG NGGNGCTTTT TTAAATCANA NAAGTATTAGATTTNTATGG 120 AAACCCTGGG GGTTCCAGTT TAATCCCTTC ATCATCTTGA AATATNACTTGTTTATGGGA 180 ANGGTGNGAT AGCAGCCNGA AACAGAGGTT TTTATTATTA CTGTTAGAGANGAGGATTGG 240 GGAATAGAAC AATGAGAGTC TTGGTAATAT TNTTCNGGAA ACAACNGACATAATTGGAAC 300 ATTAAGGAAA TATATCCATG CATTCTGTAC TTGCAAATTG CTCCAAGGAAGATGGAGAGT 360 ATTGTATTTC AGATAGAGAT ANGACTATAC CTGTTATTTT TTTCATTATAGCAACATTAA 420 AAAAGATAGT AATCTAATTT CACATAACCA TTACTACTAA AGTATATATGTANTCTTTGT 480 TTATCAGGTT TTACTTCTCA GAAATTGCAG CATCTCCTAC AGAGCCTGTCAAATGAGACN 540 GCATAGATCC CCAGAGAACA GAGAGACTGG GAAATCATTG AAATTACACAATCCTATCCC 600 AAATGTTTGC GTAGACTCAA GCTCGTATCA GCTCATAAGA TCAGTGTGTGTGTGTGTTTG 660 TGTGTGTGTG TGTCCCGCAC ATGCTTGAGT ATGCATGTGT GCATGCATGTGTGTATGTCT 720 ATTGCATTAG TAGAGATGTT AAGGTTGAAT GTATTTTCTG CTCATGGTCATTGTAAGATA 780 TTGTGCTGTA TGTGATAAGA ATCAATGTAA CAAGGCTGGA GAGATGACTTCAGCTGTTAA 840 AGGCTAGACT CACTACCAAA AATAGNGCNA TCAGTGTGAA NTTCCCCACAGGAGCTTAGC 900 AAGNTAATAG G 911 781 base pairs nucleic acid doublelinear DNA (genomic) 64 TTCAGGGGTA ATCCTAAGGT AAACGGACAA AGTAAAGGGGAGGTTGGACC AATAAAGGGG 60 AAAAATAAAA GATTAACCGG ATGTTCCCTG GAACGACAAATTGCCTTGGA AGTTTCCTAT 120 ACGGAAAAAA ATGAACAAGT TTCCTGTAAA GCAGGTAGCCGGAACGTTTC TAGGCTATAA 180 ATTTAACTGG CCTTATATTT ACAAAGTCTA AACATTTTACTGGGGCATTA CAATTTTATA 240 ACACTAATTA GATCATGTGT GTACACCCAC AGTCTGACAGACAGGGTATT TTTTCCTTCT 300 TATCCCAAGT GAGTTTAACC TTCCTTCTCC ACATTTATTGCCATGTGCAA TGCGTAGCTT 360 CTATTAACTC CTGATTATTG ATTGAACTTT ATGAGACATAAGAATGTACT TGACAACAGC 420 ATGTGAGAAA GGGAAAGTTG AGGGACTGAG TGTAATAGAGACTGATAAGA AATGAATGGG 480 CTGTGTCTGA CTCTTATCCA ACATTCCAAT TCTTCAAGTCTAAAGGTGAA GGGTCATTTT 540 CAATCTACTA AGTTTGAATA TGATTTGTGC TCCTGGTGTCTACAGAGTAT TAGGAAATGT 600 TTGGTTTGTT AGGTCATTAG GGTAGGGCTC TTATGATAGAATTCTTGTGG CTTTACATGG 660 AAAGGCAGAG AGAATACACC CACCCTAAAC ATTTCTGCCATTGTGCAATA CAGTAAGGTA 720 TATTTCTTTC TTTTTATTAA CTATTTGGTG 
ATAGTGACAAACAACTAGAC TTCATATGTG 780 A 781 389 base pairs nucleic acid doublelinear DNA (genomic) 65 TTGCTCTTAG GAGTTTCCTA ATACATCCCA AACTCAAATATATAAAGCAT TTGACTTGTT 60 CTATGCCCTA GGGGGCGGGG GGAAGCTAAG CCAGCTTTTTTTAACATTTA AAATGTTAAT 120 TCCATTTTAA ATGCACAGAT GTTTTTATTT CATAAGGGTTTCAATGTGCA TGAATGCTGC 180 AATATTCCTG TTACCAAAGC TAGTATAAAT AAAAATAGATAAACGTGGAA ATTACTTAGA 240 GTTTCTGTCA TTAACGTTTC CTTCCTCAGT TGACAACATAAATGCGCTGC TGAGAAGCCA 300 GTTTGCATCT GTCAGGATCA ATTTCCCATT ATGCCAGTCATATTAATTAC TAGTCAATTA 360 GTTGATTTTT ATTTTTGACA TATACATGT 389 340 basepairs nucleic acid double linear DNA (genomic) 66 AAATCGGGNT TNCGCGATTCGGTAATGACG NCNNATCCGT AAANNCATNC GCCGNNATNC 60 NATTNGAAAA TNCCGGGNGCAANNCGATGT CTNATTGAGG TNNCAGANCC ATCCGGCACA 120 GGCAATANGN AAAAAANGGGAGTTTCACAA TGTNTNTGAA TNTGNANCCA TTGGGCCCNA 180 AAAANTCCTN CGNTNNATGAACCTTNNCGT NCAAAANTTT GGTNCGACNC AGCNGCTTTG 240 CNAGCNTTNA ATAAACACCGGNNTCCANAA TGNNACCAGN GNTGTTTNTN TCNANTNGCA 300 TNNCNNTTTG GAANCCCNCTTTTCCCAAAA CNTTNAAAAA 340 557 base pairs nucleic acid double linear DNA(genomic) 67 AGTCCGGGNA TGGTGGCANA TGCTTTTCAT NCCAGCACTT GGGAAGGCAAAAAACAGTTA 60 NACCTNAGGT TTANCCCAGN CTTTATTAGN ACCCCGTGTT CTNAAACACAAACNACAAAA 120 NTTTGNGGGN NTTTAAGTGN AAACACTGTG TAAAACCTTG GCCCTGATGNAGGGNTCTCC 180 TTTNGAACAG AAAATGTTTG AAGANTCCNA AAACATGTTG GGATGCCANACGNGTTNTTG 240 NGCATCCATC TCAACGANGT TTTGNGAATA AATGGCAGGT NAAACTAGTACATCATCATG 300 TNGNANCCAC CGGGCNTGCA GATTTGTGGT GGGAACCAAG TCCTCCCATAAAACAGGCTC 360 CTGTGGTACN AACAGGGCTG GANCCACNGA ATCAGTGCAG NTCTGGACACCTGTCTGGCC 420 GGANGGNCTG GNCTAAGTNA ANNCAGGGGG GGCAAGAGCA TNGGANCNAACGNCAGAAAN 480 CGNCCCNCCC GGTGAGCTNT TCCATGCCTN NCCTCGNTTT ATTTGGCACTGGGCATGTCC 540 CAACTNAACT TAGGATG 557 302 base pairs nucleic acid doublelinear DNA (genomic) 68 GCCTATAAGT TTTGATTCCA TTCGTGAAAA TTTTTCCTATATCCCGAANA GTCCACTTAT 60 TACTACTGCG GCCTATTTGG AAACTAACCG AAATTCAGTTAGTTCCCTAG TAGCCTGCTC 120 TTGTAATATG TGTACTTTTC AATATTATAA AAAATTGGTCAGCAGATCTG AGTAAAACAG 180 GTGAAATTCC GATCGGTAGT CCAATTTGGT 
TAAAGAACAGGATATCCAGT GGTCCAAGGC 240 TCCAGTTTTG AACTCAAACA ATTATCAACC AGCTGNAAGCCCTATAGNAG TACGNAGCCC 300 AT 302 820 base pairs nucleic acid doublelinear DNA (genomic) 69 GACTGCCTTT TTTTTCTTCC CAAGGATACC CTGCAGCACCCAACAGTAAA AGACTTCATA 60 AATAGGCAGC TTGGAGAAGA AGGCATTACC ACTGAAGCCATATTAAATTT CTTCCCTAAC 120 GGTCCCCGAG AGAACCAAGC TGATGACATG ACCAGCTTTGACTGGAGGGA TATATTCAAC 180 ATCACTGACC GCTTCTGCGC CTGGCTAATC AATACCTGGAGGTAAGAGGC AGCAATCCAC 240 CCGAGGACCA TAGTGAACCT CTTAATGTCA TGGGTGAGGCTAGAGACCTG TTAGCCAGTC 300 AGCTGGCACT GGATTCAGTC TTTCATCCTT CGCACAAAGTGGTAAGGGTG CCATGGCCAT 360 CTGACAGACT TGCGTGCGAC TGTCCTCACA TCTCGATAACTTCATGACTC CTCTGGCTCC 420 CCCTCTTTCC CTTCCAGCAC ACATCCATTC CCAGCTATCTCCGGGCTGCC ATTGTCTAAT 480 GACTTCTGTT GGCCGGTGTC CGCCAAACCT TTGAGTTGAGCTCATTGATT GTGGACACTT 540 TACTCAAAGT TTAACAGCAT GTGAAAGACC CCGCTGACGGGTAGNAATCA CTCAGAGGAN 600 CCTCCAAGGA ACAGCGGGCC ACAAGNGGTN AACTNAANAGGGTTATTGNT AACGGGNNCC 660 GGGANCNAGT AATCGGGNCT GGCCCCAANT AAGGGTTTGGGCTTTATTNN CNGGGACAAA 720 AACCGCAAAA AAANNAAACG CCTTNTTGTA TTAAAANGCANGNTTTTAGC CTTGGCCTGA 780 AATGGNGNTA AGNTACGGCC CNCNGTCAAT TCCTACTATA820 955 base pairs nucleic acid double linear DNA (genomic) 70AANCCGANAN TTTNAAAAAA CAANNANAAN GGGCCANGAN NTNAATANTT TCTNAAAAAA 60NGANTACANG NACACGGCAG GGNNGTTTAG TCAGAATANA ATNNAGNGNN AACCATTGNC 120TTTTGAGCAG GGTTTATNGG NCTACGTTGA CCCAAGTCAC ANTGNTANCA GAGATNANNG 180AGGGGGNGGG AAGGGGTTNG GNTTTCCACA GCNTTNAAGT CAGAANTNGG AGAGACATTT 240NGCCNTGATT CANGNCTTTN CCTCCTTATT TCCNANCNTC NCATTAANAN NAGAAAAGAG 300TNTTTTNTTG TNTTGNGNAC AGGTGCACAA GTTTAGNANA GAGGAGACAN TGTNTAGAGA 360TCAGATACGG ATGAGAGTTT CCGGGGANAG TATGNGGGGA TTTTCAGTCA GNNCACTACC 420CAGAANGGAT TCAGTCGNGA GGAGNCAGGG ANGGGGTGNT GGAGTTNAGA CCGANAGAGC 480GGNTAGCATN TAATGNNNAG AGAACACACA TNTTTTGGAT TTNAGAGACG NCCAAANCGC 540TATACANGAT NTNTCGNTAN AGGGTGAAGA GTGAAGAAAG TGATGTCTCC ANCGCANACN 600GGAACANGCN GCGANTTTCT TAGAGACCNA GGTTTTGATA NAGGGAAAGT CTATTCAAGC 660CTCCCGTANA CTTGTAGGNC AAGNAAATAN TGCNNATTAT GAGNCCGTTG TTNTCAAACC 
720ANGTCCCCTA TAGCAGCAAA NAGTTGNCAG AAANTCNCAC AGAGNTCCCC CGTGAGATNG 780NNNTTATNGN GGACACGATG TCATCAAGAG GGAGTNNTGN ACTGTGACTC CAGTCCTGTT 840GAAGNGCATA GTAGACCATT CGCCGTGTTC ACCNACANTC AGCCNCTACC AGCNGAAAGA 900GNAAAGGAGA GAGTTCGCAT ATGANAGACC CCACGGGTAG TTTGCAAGTA ATGAG 955 886base pairs nucleic acid double linear DNA (genomic) 71 NTNGAAGNANAAATTNGNAA AAANNCCNAA AACCTCCAAA TTTGCTACCA NTCTTCNACG 60 GTNGACTTTTAAACAAAAGG AGGGGGGGGT TCTTNTTCAA ATGGGCCCCT TCCCAATCCT 120 GTTCCCNAGGCAATTGTTTC TTNTTTCANC NTTCAACGGT TTTTGGGTTC CATCCAACTT 180 TTATTTNACCCNTTGAGTTT CCTGGCCGGN GCCTAGGGAC CTCCTTTTTA CNTGGGCCAG 240 TTCCCGTTCAAGACNACCCG GCGGTTAGTG GNCATGGGGA GATGGCCCCA TGANTCCAAG 300 ACAACTGTATTCCCGGTTTT TTAGTATTTC CAAGCTTCCC GCCAATTTTT CTTCCTTCCG 360 CTTCCAGACAGTTTTGCCAG TNACGTGATT CGGTTCCGAG GCCCCAGCAC CATGGAGANT 420 GCGCGCTGTANTCTTAGAAG GGCATTCTTC CGCCCCACNT CCCGGTNTAG CCNGAAGGCC 480 CACGGAGCAACGAGGAGAGC GACGNTNTCT CCACAGCCGT GGCTTTTTTA TGGTTGGCAC 540 TTAAGGNTTCGCCGCCATTT TGTCCGTTCN TNGAGTTATT GTGTTGAGGG CAAGATCTTA 600 CGATTGGGTTTTGAAGGCAT GGGTAGTGGC TTGTAGACGC ATGGCAGGAG TTGGGATTCG 660 TTTGGGGACACTGAGGGGAA GCCGNTTCTT GGGGTGTGTC CCCTNGACGC TGTTGTGGGT 720 GGGGACCGGAACTAGACGTG CCGGGCTGCG GCGCCCAGCG TGGGAGGACT CGCGCGGGCT 780 GGCAGCCGGGCTGGGTGTCC CGGCGCCTCA CTCACATTTT TTGCCACGAT TGTCGCCTGG 840 TTTGATTTCCCACCAATCCC CCAGACCGTG CACGAGGAGT AGAAGC 886 900 base pairs nucleic aciddouble linear DNA (genomic) 72 GGGNGTTNGC TCTCAGATGC NAGNTACNNNTCAGGGGGNG TCTCACGAGA AAANCTNATG 60 TGTGGGGGNT ANTNTGTATC CCCTNNNCTCNCTCGAGANC CCNNNTCTCG ANATTTTGGN 120 GACCNGGGGC CGGGGCCCAG ANACTCNCCACCCCATATGG NGACCCTNTA TAAGTGTCNN 180 CCAGGGNNTG TTTTGGGNAA AATATANCNNANAGNGGTGT NTNTNANATC TCGGGGGGTG 240 ACAGACCCNN ATTTTTTTTT ATAAAGACCCGGGGCATNTT CTCNGCCCCN TCTCCTCNGC 300 TACANGNNAC CCACACACAG TGTGTCTCCTCTCAGCCCCC TGGCACACTT TNTNTNGANT 360 CNGNGGGGAT ATGAGATTCN CNAGACTGGGNCCGCNNTAN TANNCNCCCC CNTGTCTCCT 420 CTCATAGTGT NGTGTCCCCC CCTCACCCNNTNTTGNGGTN CCCTACACCC ACACAATNTA 480 GACTCTNCCC NCCNTCNGCT 
NTGNGACNCACANCTGNAAA TCCCGNNNCN CAAAAAGGGC 540 TGTNCTCCTC TCTNTTACNG GGNGGTCNCCCNCNNNNGAC TCTNAAANGT CCCTCNCAAA 600 AGGGACNCTT TTCTATACAC NCTTANTTTNCCTCCTTTGT NTNGCAAAAA ANNANCCTGT 660 GTTNCCCCCC NCTTTATNAT NTTTNTTTTNTTCCCCAAAC TAANCTTTTA GGNNTNANCT 720 TCCGGGGCCC CAACCCCAAA ATCCCANTNTTCTTTTNTNT TGGTTGGGGT GTCAAAATTC 780 CTNCCCCTAA ANTTTTGAAC CCCCTTTAATTCCCCCCCCC GGNTNAAGGC CCNACTTCCC 840 TNGGNTNTTT TCNCTAAAAA ATTTTTTGTNGCCCTCCCTG GGAAATCCCC GGTATTCCTC 900 1033 base pairs nucleic acid doublelinear DNA (genomic) 73 CCTACGTTCA CCTATGCGTA ACAGATCTGC TGTGTCAGGAGCCTCCTACC CTCGCGCATC 60 CTGACCCCCA ACCACGTCCT CTTATCTGAT GACTGGTCATCTTCCCAAGT CATACACCTC 120 ACCAGATCAC TCGTGGGGAT CTCTAGGCCA CCTCCTGTGGTACCCTAGGC CTTGGATCAC 180 TACTAACTCC TGCATCGTGG TAACCTCAAT GGCTGATCTTGAGGATGCAG TCTGGAGTTC 240 GACTCCATCA GGAAGCCACA TGGGGAGGTG GCTGAATGCCACAGGCACCT ACCACATAAT 300 GCTTCATGTC CCCACAATAG TGTCATCAAG CANCGNTATCTCCCTTTGTA CCTGNCTATC 360 ACAGTAGGCC CTATGTGTTG AAGACAGAAA CGTTCTNATACTCAAAATAG CTACCTACTT 420 TCATCTTTAG NAAAGTTATC ACCAGAGATT TCATCACATGNCTNGGCTTA NGTATTTTAT 480 CCCCTTTCTG AACTATTTAT CACGGGCAGA AAATNTACTGATTATCCCTG TATCATGACA 540 TCGTGCTGNA GAGAAGACCC GAGTGGGCAG CATGGNGATCCAAGGAGACA AGGGAAACCA 600 AGCAGCTATA CATAGGATGT CAGCAGCAAG CCCTTCCCTGCCCACGTCAG ACTAAACCCT 660 TCAGTCCCTT CATCTTTTCC TAGAAGGGTT TGTAATTTCTGTTGATTGTG CACCAGCGCT 720 TCCCAATCGC TGAACATCTT TCTTCGAATG TGACTCAAAGTGAGTGCACC GAGTCTGGCT 780 AATGTCCTCT GCTCCTCTTA ACCTCTGTGG CACACTCCTCCTAACACATG TGTGTCGTCT 840 TGTTCCACAG TGGCCCCACG GTACTGGTTT CAATATAGCTTATGTATGAG CAATAAGGGC 900 TATGTATTTT TTTTTTTCAG ACACTGTTCC TTTTGTATTCAACAACCTCC TCACATACTC 960 AGCCGNACCA CATTTCTTCC AGGTCAAAAA CCATCTCTCCAATTTGTTAT GAATTACTCC 1020 TNCAAGTTCA GGT 1033 883 base pairs nucleicacid double linear DNA (genomic) 74 GGGGGGNNAA NAATTTCCCA AAAANNGNNGGNCCCNTTTT TTATCCAGTT TNNGGTTGAA 60 NATCTCNCCC CGGTTTNAAA ACCCNCAATGGGGAAAAAGG TACANCNGAT TNTTTATNGG 120 TTTGGGCGGA GGGGGAAATT TTTTTGGTTTTTTTNTTTNN GGGATTTTTG AAAAAAAAAN 180 GAANTTTTTA GGTTTCCCNN 
ANGTAATTTATTTCAATGGA CCATTTTTGG GGTTCTCCCT 240 TTTGTAANAN GTTAAAAANA AGGGANTTCCAANNTTNCTT TTCAGTTTCC AGTTTCACCT 300 TCNGTAGCAG ACCCAGTTTT CATTTTGAGNTGGTNCCNAA AAGGNTTCCC AACTATGTTC 360 AATACCACAG GCAGCCTGCA GGAGGGAGAATGGGTATGTA TTTAACAGCA TTTGACCAAA 420 TTATAAGAGC AGAGAGGAGC TTTACCAGGGACAGGAAGGC AAAAGAGCTG AATNTTAAAC 480 AAAAGAATAA GAACAGGATN TCATCTGTGAGCTGTCACAG TGGGTTTGCA GAGCAGGAGA 540 ACACAGACAG GATTAGCTAT AAAGTTGTTACATTAGTTAT TNTATTGGAG CATACAATAC 600 TTAAATAGTT CTAGGGCAAG AGAAATGAACAGAAATGACC TTATAAGAGC CAGAGCTGTA 660 GCCACAGCTT TCTTTGTGCT TAGTTTGNTAGTTCANTCTT TCCAGGGCAG TCTGGTGGAT 720 NACACCAAAT TGCTTTAGAA AATGCTAGNTCTACTGTCCC TGTCTATTGT CAGCTTTGCA 780 ATGTGCATAG TGACAGGAGT TGCCTGGGAGCTTGGGGCTT ATGTTTTGCA GATCCATTGT 840 AATTAAAAAA GAATTGTAAG GAGATGGAGGCACGGGGTGA GGG 883 892 base pairs nucleic acid double linear DNA(genomic) 75 GGGCCCCCCT CGAGGTCGAC GGTATCGATA AGCTTGATAT CGAATTCAGCTCTTAGCAAT 60 CTGACACCCT CTTCTGGCCT CTTCAGGCAC CTGCATGGTT CCACAGGACTGTCACACCCA 120 CGTACATAGA TAGTCAAAAT CTAGAGCACT GTTTCTATAC CTGTGAGTTGCAACCCCTTT 180 GGGAGTGCGG TCAAATGACC CTATCACAGG GGTCTCAAAT GAGATATCCTGCATATCAAA 240 TATTTACATT ATGATTCATA GTAGTACCAG AATTACAGTT ATGAAGTTACAAAATAATTT 300 TATAGCTGAG AGTCACCACA ACATGCATAA CTGTATTAAA ATGTTACAGCATTAGCAAGG 360 TTGAGAAATA CTGGTCTAGA GCCATTCCTT GTGCTGATAA AGGTGGCAGTGAGCATTATC 420 TTTCTGTCTC CACACCACTA GCAAATTTTT TCTCTATATA TAAACATGTAATATGAGACA 480 GTCTGAATCC ACTGAGGCAC GGTCTGACTC CAGAACAAAG GATCGTATTCCTGAAAAGCA 540 AAACGTGTGT TTGGCACTGA CTGTGTGNCC CAGGTTNTCT TTCTGNACTCCTAGAGGTCT 600 GTANTGGGTC TTGAAGCACA GATNCTCTAA CCTTACCCTG GNNGCTCAGTAGNATGCCCC 660 AAAACNCANG NTGTTCAACA TNGGGNNCCN CCCNGAAACA GNGNTGTNGGATTTGGNAGA 720 AAGGTGNAAT NCTTTGGGCN NNTCGGTTTA GGAATTTTAA ACANNAACTGGCTTNCNAGG 780 TCCNTTCCGG AGTCATCCTT NCACTGGNGC CCNCTGGACC CGGNGNANNGGGCCANTTCG 840 CCAGTTCGTN CCCCTGGNAC CCNTCNCCGG GGGCNAAANG CCCCTNNNNT TC892 884 base pairs nucleic acid double linear DNA (genomic) 76TGGGCCCCCC TCGAGGTCGA CGGTATCGAT AAGCTTGAGG GACCCACGTG ATGGAAAGGG 
60AGAAGCAATT TAGTGTCCTT TGTCCTCTGA CCTCCACAAG TGCTGTGGCA TGGGGACACA 120GGACTGTACA CACACACACA CACACACACA CACACACACA CACACACGCA CGCACACACA 180CCCCTCAAGT AACCGTGGAA TAAAGGTCCG ACCAGAAACC ACGCTGGAAC GGGAGATGCT 240GGAGCACATC AGGGTGGTGC TAAGCAGCAG ATCGGCCTGT AACTGGCAGC AGAGGGGTGT 300GGCTCTTTCA GAACCAGGAG GGCATCGCCC CTCCAGCCAG ACTCTCCAGC TTTCTTCCCC 360TCCTTGCCTC CTGTTTTCCT TCTGCCTACC TTCCTTTGGC CTCAAACCAT AATGTGCAAC 420ACATTCAAAC TGTAGTAAGT GTTTTAATTT TCTACTAAAC AATAAAACCT TTAGATTTTC 480ACTGGGCCAG TGCTGGTAAC AGCAGACTGG GTGGAGTATC ACAGAGGGTG TGGAGCAAGC 540TGGCTACCCA GGGCTGGGCA CACTCAACAC TCTGGCATTC TGTGGAAGTT CTGGGCAGTA 600AAAACAGAAG CATACGTCAC GCACAGGTTC CATAGTGTTA GGCATCTTAA TCTATCTAGA 660ATACCTGGTG TTTAGTTTGT TTACAAAATT GATTGTTGTA CTTGGACAGT GGTGTTTTTT 720TCCCAGGGCT TCCAGGATTT AGGGGTATAC CAGGCCCATT ACATTGGGTA AACGTGTGTG 780TTAATTTTTT CTTTTTAAAC CTCCTTGGTT GACTACTTGT TTTCCTTTTT AATGGTCCCA 840GTTCCCCTTG GGGGGTTTGT TTTGGAAAAA GGCTTTCCGG TTTC 884 326 base pairsnucleic acid double linear DNA (genomic) 77 AGCACACCAC AGAGAGGGGGTCTCCGTGCC CGAGAGGCAA AAGTCTCCCA CTGTGCTCCT 60 CTCCCCCCCT GGTGGGGGTTAAGAGATGGG GGCTCTGGGG GGTGATAGAA CCCCTGGCGG 120 GACACCCCCC CGCTCTCGTGGAGAGAGACA GAGGGGGGTG CCCCTGATAT CTCACTAGAG 180 GGGAGAGGTG AGAGGGCTCCACAGTGTGGT GTGGTGGTGA GTGCTCTATC TCCAGGTGTC 240 TCACATATTT TCACAGCTCTTGACCACAGA GAGATCTTGT TGACTCTGTG CTCGCGGAAT 300 CTAATGTGCC CCACATCATATACACA 326 557 base pairs nucleic acid double linear DNA (genomic) 78GGGGGGGTCT CACNNTANAN CACTCNGGNG TCTCCCATGT CTAGATCTCC CCCCNGCNCN 60NGNGANGAGT GTGNGGAGAT CCCTCTCTGN TCTCTACACT CTAAAGGGTA NGCGGGGAGA 120GAGAGAGAGC ACANTCTATA GANCACANAG CACACNCGCT CNANGTGCCC NANTNACANG 180NNAGAGAGAN CCCCTCTCNC AGTATATNGG GGAGAGAGTN TGAGGGACNC TCCTCTTTTC 240TCTCAACNCT GNGGGGGGAG NGNGAGTGTT CTCTCTGNGG GGNGGAGNGG NACACTCNGN 300TCTNCGTNTG NGTGCNCNNG TNTTCTGGGG GTCACANAGA AATCNCCTNT CTCAACACAA 360CAACAACAAC CCCCCGCACG NGCACACACC ACAACAACAA NGGGACANCG CGNGGGGGNT 420NGNGCACACC CAGNGGAGAC ACTGTTTTCT GTTTNACACA CACACACACA CACACACACA 
480CNCNCCCCCC ACANAGTTTT TNGGAAAANC GCNGGGGGGG GNGGGNCTTT TTGCCNCAAG 540CCTTTTTTNA NCNCCCA 557 376 base pairs nucleic acid double linear DNA(genomic) 79 GTCTCCCCCA AAGGGGGGGT CTCACCCTCC CGGACACCAC ACATCTGTCTGTCTCTCTGA 60 TCTCTGACAC CCCACAGAGA TATATATAGG GACAACGCCG CTGTCCCCATGATATAGAGA 120 GAAGCGAGAC AAACTCTCAG GTACACATGA CACATGATCC CCATGATCCCCGGCACACTC 180 TTCTAATATA GTTGAGAGAG TTGTGTCTCT CAAGTGTCTC TGGTATTTTCTAACCCCATG 240 TTTTCTCTCA CAATGTCACA CGGGGGAGCT CGGACGCGGT GCACATGGGGGAGAGTTCGT 300 GTCTATGACA CACTAGTCTT GCCCCCGAAC CACAGAGACC TCGACTCGGGTTTAGTCTCC 360 TCTGCCCCCC CAGCTC 376 533 base pairs nucleic acid doublelinear DNA (genomic) 80 ATNNCCCAAN ATCANATGNG GAANNNCCCA CATTTTNTATNTAGAAANGN GTTTTGTGTG 60 TGTGNGTNNA ATTTGAGNTT TCACAGAGNT NACATTCTCTGTGTCACAAN CCCTTTCTCT 120 CTACACTCCA CAGTGTGGTG NGAGATATAC TNTGANACANATGNGCTCTC TCCTCNCCCC 180 CCNNCATGTT NTNCCCCACA GTNTACNNCN NCNATATATNGNNCNCNGNA GANNGGTATG 240 NGNGNTGTNT TTNTTTAAAA AGATNTNANA NAGNGGGTATGCGTGNGGGG TATGTNNANA 300 CATATATGTN NNAGAGGGTC TCTCTGNGGC CCNATGGAGGCANATCCCCC CCNCTCNGAG 360 NNATATAGAA AAGAGTNTTT NANGGTGTTT GTGGACACAGATAAGGGGAG AGAGAGAGAG 420 AGAGANAGAG AGAGANAGAG AGAGAGAGAG AGAGAGANANGGNGTNTTNG GNTTCNTCCC 480 CCCCNATATA CAGAAAAANC GGGGGGGGGT TAGGNGGNNGGGGGTTTNCT TTA 533 346 base pairs nucleic acid double linear DNA(genomic) 81 TTTCACACGA GATGTCGCGA CTCTCGCGAG ACTCTCAGCG CGGAGATATAGACCCACAAG 60 GGGAATCCCC CGGGTTTTTT GCCACAGGAG AGCGCGAGGA GAGAGATATTCTTATTATGG 120 CTATAGACAC CCCCGTGGGT GGGGGACATT TGTGGTGTTT CCACAGGGGGGGGGATGTAC 180 CCCGGATATC AGAGTATTCT CTAAAAAAGG TGAGAAGAGG TCTTCTCTTTTGAGAGTATG 240 GGGACACTCG AGGAGAGCTC TCTATCTATC TCTCACAGCG CCCCTGTGTGGGCGGATCCT 300 CCACACCAGA TGTTAGTGTG NAGATCTCCC CATCTTCTAT ATTGAA 346461 base pairs nucleic acid double linear DNA (genomic) 82 GAANACCCAAAATTGNGCTN GTGGGCAAAN NTTTTNCCGT TTCTTGTGCT TGNGCGGCNA 60 AGNNAAAAATTCAAAACCAA NACCACANAA GCGCGTTATC CTGNCTNTCT GCCNTTNCCC 120 TGTCACACTGNGGCTGTACA GACATCNANC GCTTTCTAGA GAGACGNGAG AGTCAGGGGA 180 
CTCTTTCCCCCANNCGCATT ATANCCACAT ATTAGNGTAN NANATTCAGC TGTGNTNCAC 240 TGGGNGTGTCTCCNTAGTGT GAAGCAACAC AGGGAAACTN TTCGCNCACA TGTCCTCTGG 300 TGTTCACAGANATAAGNAGG CTCCTAGACC NNTATNACTG TGGGNAGAGN ATGTTACCTC 360 CCTATANNTCGGGGTCTATC TCTGTGAGAN AGAGNTTCCT TTCTCCCATN CCTACCTCAG 420 TGGGGTGNTATNTACATCNC AGAGAGCAGA NAACTGTGAG C 461 367 base pairs nucleic aciddouble linear DNA (genomic) 83 GGGGTNTCAC AGAGANAGGG CACANCTCTCCCNAGAANGG GNCNNCCCTC TTTTTNNGGN 60 GTAACACCTC TCNCCGTGTC TCTTTCTTTCTTTTTTNTTT TTTGGGGGGC TCTTTTTCGN 120 GGAGGNGGAG NNCGNCCGAG GGTCGGGCNNNNCNGNGGAN AGCTCTNTCN CANNGATATA 180 TCNCCNNANC CCCCCTGTNT CTTATAANNNACATCTCTTC NTCNCAGGGT CACACCNAGA 240 NTCTCNTTTC TACAACAACC CCCACACGCNAAAGCTCCCC ACNNNGNGNG GGGGTCTCNC 300 AAGAANATCT CNGCGGAGAG GTGGNGGAGAGAGTGANATC TGNATNTCTG GNTTCCCCNC 360 ANTGCCC 367

What is claimed is:
 1. An isolated nucleic acid comprising a nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.
 2. An allelic variant or homolog of the nucleic acid of claim 1.
 3. An isolated nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.
 4. A host cell containing the nucleic acid of claim 1, 2 or 3.
 5. A nucleic acid that selectively hybridizes under stringent conditions with the nucleic acid of claim 1, 2 or 3.
 6. A nucleic acid having a region within an exon wherein the region has at least 50% homology with the nucleic acid of claim 1, 2 or 3.
 7. A nucleic acid having a region within an exon wherein the region has at least 60% homology with the nucleic acid of claim 1, 2 or 3.
 8. A nucleic acid having a region within an exon wherein the region has at least 70% homology with the nucleic acid of claim 1, 2 or 3.
 9. A nucleic acid having a region within an exon wherein the region has at least 80% homology with the nucleic acid of claim 1, 2 or 3.
 10. A nucleic acid having a region within an exon wherein the region has at least 90% homology with the nucleic acid of claim 1, 2 or 3.
 11. A nucleic acid having a region within an exon wherein the region has at least 95% homology with the nucleic acid of claim 1, 2 or 3.
 12. A protein encoded by the nucleic acid of claims 1, 2, 3, 5, 6, 7, 8, 9, 10 or 11.
 13. A nucleic acid comprising a regulatory region of a gene comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74, SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.
 14. A construct comprising a regulatory region of claim 13, wherein the regulatory region is functionally linked to a reporter gene.
 15. A method of identifying a cellular gene necessary for viral growth in a cell and nonessential for cellular survival, comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, thereby identifying a gene necessary for viral growth in a cell and nonessential for cellular survival.
 16. A method of reducing or inhibiting a viral infection in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, or a homolog thereof, thereby treating the viral infection.
 17. The method of claim 16, wherein the composition comprises an antibody that binds a protein encoded by the gene.
 18. The method of claim 16, wherein the composition comprises an antibody that binds a receptor for a protein encoded by the gene.
 19. The method of claim 16, wherein the composition comprises an antisense RNA that binds an RNA encoded by the gene.
 20. The method of claim 16, wherein the composition comprises a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene.
 21. A method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising the nucleic acid set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, or a homolog thereof, to a mutated gene incapable of producing a functional gene product of the gene or to a mutated gene producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject.
 22. The method of claim 21, wherein the cell is a hematopoietic cell.
 23. A method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by the method of claim 15, to a mutated gene incapable of producing a functional gene product of the gene or to a mutated gene producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject.
 24. The method of claim 23, wherein the virus is HIV.
 25. The method of claim 23, wherein the cell is a hematopoietic cell.
 26. A method of increasing viral infection resistance in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by the method of claim 15, to a mutated gene incapable of producing a functional gene product of the gene or to a mutated gene producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject.
 27. The method of claim 26, wherein the virus is HIV.
 28. The method of claim 26, wherein the cell is a hematopoietic cell.
 29. A method of screening a compound for effectiveness in treating a viral infection, comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product necessary for reproduction of the virus in the cell but not necessary for survival of the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for treating the viral infection.
 30. The method of claim 29, wherein the cellular gene comprises the nucleic acid set forth in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, SEQ ID NO:42, SEQ ID NO:43, SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:46, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, SEQ ID NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:62, SEQ ID NO:63, SEQ ID NO:64, SEQ ID NO:65, SEQ ID NO:66, SEQ ID NO:67, SEQ ID NO:68, SEQ ID NO:69, SEQ ID NO:70, SEQ ID NO:71, SEQ ID NO:72, SEQ ID NO:73, SEQ ID NO:74 or SEQ ID NO:75, or a homolog thereof.
 31. The method of claim 29, wherein the cellular gene is a gene identified by the method of claim 15.
 32. A method of screening a compound for reducing or inhibiting a viral infection, comprising administering the compound to a cell containing the construct of claim 14 and detecting the level of the reporter gene product produced, a decrease or elimination of the reporter gene product indicating a compound for reducing or inhibiting the viral infection.
 33. A purified mammalian serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus selectively prevents survival of cells persistently infected with reovirus.
 34. A method of selectively eliminating, from an animal cell culture capable of surviving for a first period of time in the absence of serum, cells persistently infected with a virus, comprising propagating the cell culture in the absence of serum for a second time period which a persistently infected cell cannot survive without serum, thereby selectively eliminating from the cell culture cells persistently infected with the virus.
 35. The method of claim 34, wherein the second time period is from about three days to about ten days.
 36. The method of claim 34, further comprising transferring the cell culture from a first container to a second container.
 37. A method of selectively eliminating from a cell culture cells persistently infected with a virus, comprising propagating the cell culture in the absence of a functional form of the protein of claim 33.
 38. A method of reducing or inhibiting a viral infection in a subject, comprising administering to the subject an amount of a composition that inhibits functioning of a serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which, when removed from a cell culture comprising cells persistently infected with the virus, prevents survival of cells persistently infected with the virus, thereby reducing or inhibiting the viral infection.
 39. The method of claim 38, wherein the composition comprises an antibody that binds the serum protein.
 40. The method of claim 38, wherein the composition comprises an antisense RNA that binds an RNA encoded by the gene.
 41. A method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising (a) transferring into a cell culture incapable of growing well in soft agar a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected cells which are capable of growing in agar a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell.
 42. A method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising (a) transferring into a cell culture of non-transformed cells a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected and transformed cells a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell.
 43. A method of screening for a compound for suppressing a malignant phenotype in a cell comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product involved in establishment of a malignant phenotype in the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for suppressing the malignant phenotype.
 44. A method of suppressing a malignant phenotype in a cell in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83, or a homolog thereof, thereby suppressing a malignant phenotype.
 45. The method of claim 44, wherein the composition comprises an antibody that binds a protein encoded by the gene.
 46. The method of claim 44, wherein the composition comprises an antibody that binds a receptor for a protein encoded by the gene.
 47. The method of claim 44, wherein the composition comprises an antisense RNA that binds an RNA encoded by the gene.
 48. The method of claim 44, wherein the composition comprises a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene.